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SUMMARY 


In  this  project,  we  sought  to  understand  how  users  and  organizations  balance  privacy 
needs  when  using  personal  devices  for  both  work  and  personal  purposes —  that  is, 
“Bring  Your  Own  Device”  (BYOD)  environments.  Our  goal  was  to  better  understand 
the  challenges  users  and  organizations  faced  so  that  we  could  develop  a  better  privacy 
management  system  that  addresses  their  needs.  Specifically,  we  performed  qualitative 
interviews  among  “extreme  users”  (those  in  environments  who  have  particularly 
complicated  needs)  to  answer  the  following  questions: 

•  What  do  users  want  control  over  with  respect  to  privacy  in  an 
organization? 

•  How  should  these  controls  be  implemented? 

•  How  can  we  build  systems  to  make  automated  decisions  in  order  to 
reduce  user  burden  and  facilitate  better  choices  in-line  with  users’ 
expectations? 

In  answering  these  questions,  we  interviewed  15  employees  working  in  law 
enforcement  organizations  (i.e.,  the  California  Department  of  Justice  and  Los  Angeles 
District  Attorney’s  offices).  Interviewees  were  in  a  range  of  roles  with  very  particular 
privacy  needs  (e.g.,  prosecutors,  investigators,  and  administrators).  We  found  that  users 
had  sophisticated  privacy  needs  that  were  not  being  met,  particularly  surrounding  the 
access  to  personal  information,  location  data,  photos/media,  and  contact  information, 
when  using  their  smartphones  in  the  workplace. 

In  addressing  these  users’  needs,  we  developed  a  modified  version  of  the  Android 
platform  that  allows  users  to  control  how  information  is  collected  by  third-party 
applications.  The  heart  of  our  system  is  a  trained  classifier,  which  makes  real-time 
decisions  about  whether  an  application  should  have  access  to  certain  protected  data. 
Additionally,  we  provided  specific  functionality  to  address  the  areas  that  users  found 
most  problematic: 

•  For  contacts,  we  provided  a  separate  public/private  primitive,  which 
allows  users  to  prevent  applications  from  accessing  professional  or 
personal  contacts,  when  appropriate. 

•  When  denying  access  to  certain  data  types,  our  system  instead 
provides  “fake”  or  less  granular  data,  which  allows  privacy  to  be 
preserved,  while  limiting  negative  impacts  on  application 
functionality. 

•  For  access  to  photos/media,  we  developed  filters  to  identify  and 
optionally  remove  potentially-sensitive  features,  before  that  media 
is  shared  with  applications,  in  accordance  with  the  end  users’ 
wishes. 

We  performed  an  initial  field  study  using  a  sample  of  131  Android  users.  We  applied 
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the  Experience  Sampling  Method  (ESM)  [11],  wherein  we  periodically  prompted 
participants  about  third-party  applications’  recent  accesses  to  protected  data.  The 
prompts  asked  participants  whether  they  would  have  allowed  or  denied  the  access,  if 
given  the  choice.  Alongside  these  prompts,  we  collected  contextual  data  about  what 
applications  had  accessed  data,  the  types  of  data,  and  other  situational  information.  We 
used  the  responses  to  these  prompts  to  train  a  classifier  using  the  contextual  data  as  a 
feature  set.  We  showed  that  this  classifier  was  able  to  make  much  better  decisions  about 
access  to  sensitive  data,  reducing  the  error  rate  over  the  existing  system  by  four-fold. 

In  parallel  to  this  work,  we  developed  a  user  interface.  The  theory  behind  the  user 
interface  is  that  if  machine-learning  is  being  used  to  automatically  infer  users’  privacy 
preferences  and  making  decisions  based  on  those  inferences,  an  interface  is  needed  to 
show  users  what  decisions  have  been  previously  made.  This  serves  two  purposes: 
transparency  and  control.  In  addition  to  showing  the  users  what  decisions  have  been 
made,  it  also  allows  them  to  alter  those  decisions,  thereby  retraining  the  classifier  (and 
further  reducing  its  error  rate).  We  evaluated  this  design  using  an  online  study,  which 
showed  it  to  be  usable. 

Einally,  we  implemented  both  the  classifier  and  the  user  interface  into  a  working 
version  of  the  Android  platform  and  performed  another  field  study  to  validate  our 
designs.  We  gave  instrumented  phones  running  our  system  to  37  participants  who  used 
it  for  a  one- week  period.  We  corroborated  our  previous  findings,  in  that  error  rates  were 
reduced  by  50-75%  over  the  privacy  management  system. 

In  addition  to  our  working  system,  the  result  of  this  work  were  three  academic 
publications  [18,  19,  17],  and  an  additional  publication  in  submission.  These 
publications  and  the  publication  in  submission  address  the  research  questions  above, 
covering  findings  from  our  interviews  as  well  as  innovations  in  the  user  experience, 
and  the  technology  to  manage  privacy  requests  at  the  OS-level,  as  well  as 
improvements  and  integration  of  our  classifier. 

These  results  demonstrate  that  when  provided  the  technology  we  have  developed,  end- 
users  can  effectively  manage  complex  privacy  decisions  and  trust  the  system  to  handle 
their  private  information  in-line  with  their  expectations.  The  implications  of  this  work 
are  that  BYOD  solutions  that  are  end-user  controlled  can  potentially  increase  adoption 
of  these  systems  while  simultaneously  addressing  joint  user-organizational  concerns 
around  privacy  and  information  security. 

2  INTRODUCTION 

Mobile  phones  are  increasingly  used  as  general  purpose  computers,  with  users  taking 
advantage  of  a  great  number  of  apps  to  help  with  communication,  productivity, 
organization,  and  diversion.  Erequently  these  apps  are  nominally  free,  but  the 
developers  make  a  profit  through  the  collection  of  sensitive  user  data.  Eor  example, 
Google  Play’s  current  top  two  flashlight  apps  (i.e.,  apps  that  turn  on  and  off  a  phone’s 
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camera  flash)  require  access  to  the  user’s  location,  and  previously  requested  address 
book  contacts.  This  access  to  private  data  is  not  needed  to  actually  operate  the  app 
itself,  but  instead  is  only  used  for  monetizationpurposes. 

These  resources  are  highly  sensitive.  In  interviews  we  conducted,  we  found  that  users, 
particularly  those  in  law  enforcement  professions  find  the  idea  of  delivering  your  entire 
address  book  to  a  third-party  entity  to  be  unacceptable.  Such  users  are  not  alone.  In  fact, 
both  Android  and  iOS  provide  a  permission  system  as  a  means  to  protect  users’  private 
data.  Arbitrary  apps  cannot  simply  access  all  user  data  they  must  first  request  access 
and  the  user  then  grants  it.  This  is  the  application  of  a  fundamental  principle  in  security: 
the  principle  of  least  privilege.  This  means  that  apps  should  only  be  given  the  power 
and  capabilities  that  they  need  to  work  and  nothing  more.  This  mitigates  the  damage 
caused  by  malicious  behavior,  both  accidental  and  intentional. 

Originally,  Android  permissions  were  presented  as  install-time  ultimatums  that  were 
unsuccessful  at  achieving  their  goals  [9,  6].  More  recently.  Android  started  using  an 
ask-on-first-use  (AOFU)  model.  In  AOFU,  the  user  is  explicitly  asked —  at  runtime — 
whether  to  grant  or  deny  a  permission  via  a  dialog  box,  the  firsttime  an  app  attempts 
to  access  the  protected  resource.  This  approach  gives  the  user  a  little  more  contextual 
information:  e.g.,  it  may  be  curious  that  a  text  messaging  app  needs  to  use  the 
microphone,  but  knowing  that  the  request  occurred  after  the  user  tried  to  use  a  speech- 
to-text  feature  clarifies  the  likely  rationale. 

By  design,  however,  AOFU  takes  the  user’s  decision  at  one  moment  and  then  uses  it  in 
perpetuity  for  all  future  requests  from  that  app  for  that  permission,  unless  the  user 
navigates  several  layers  of  settings  to  change  it.  During  the  course  of  this  project,  we 
showed  that  this  method  is  highly  error  prone  [19]:  it  mis-predicts  the  users’ 
preferences  resulting  in  privacy  violations.  This  is  because  AOFU  is  not  a  correct 
model  for  user  behavior  and  decision-making  [18,  19].  AOFU  fails  to  account  for  the 
contexts  in  which  future  requests  may  arise.  Users  are  nuanced  and  they  vary  their 
decisions  based  on  a  variety  of  factors,  such  as  the  visibility  of  the  requesting 
application  (i.e.,  whether  it  was  actively  in  use  when  it  requested  a  permission),  what 
the  user  was  actually  doing  at  the  time,  as  well  as  a  variety  of  other  factors. 

As  part  of  this  project,  we  implemented  and  evaluated  the  usability  of  a  novel  mobile 
privacy  management  system  that  builds  heavily  on  our  prior  theoretical  work,  as  well 
as  the  interviews  with  “extreme  users” — those  with  very  serious  privacy  concerns  when 
using  their  mobile  devices  in  professional  environments —  that  we  conducted  to  better 
understand  the  problems  that  users  face.  To  resolve  the  longstanding  challenges  of 
mobile  privacy  management,  we  applied  machine-learning  (ML)  to  dynamically 
manage  app  permissions,  and  then  we  proposed  a  user  interface  design  to  help  users 
manage  that  system  [17].  This  work  applies  Nissenbaum’s  theory  of  Privacy  as 
Contextual  Integrity  [14].  We  then  implemented  these  systems  on  the  Android  platform 
and  performed  a  field  study  to  evaluate  their  effectiveness  at  aligning  app  privacy 
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behaviors  with  users’  expectations.  The  machine-learning  (ML)  model  runs  entirely  on 
the  device  and  uses  infrequent  user  prompts  to  retrain  and  improve  its  accuracy  over 
time.  When  the  ML  model  makes  a  mistake,  the  user  interface  is  available  to  support 
the  user  in  reviewing  and  modifying  privacy  decisions,  thereby  retraining  the  ML. 

We  performed  a  37 -person  field  study  to  validate  our  new  privacy  management  system, 
measuring  its  efficacy  and  how  it  interacted  with  participants  and  third-party  apps.  We 
issued  each  participant  a  smartphone  running  a  custom  Android  OS  with  our 
permission  system  that  used  an  online  classifier,  which  participants  used  as  their 
primary  phones  for  a  one- week  study  period.  This  produced  real-world  usage  data  from 
253  unique  apps,  which  corresponded  to  more  than  1,100  permission  decisions. 
Overall,  participants  denied  24%  of  permission  requests.  Our  data  show  that  AOFU 
matched  participant  privacy  preferences  only  80%  of  the  time,  while  the  new 
contextual  model  matched  preferences  95%  of  the  time,  reducing  the  error  rate  by  75%. 

3  METHODS,  ASSUMPTIONS  AND  PROCEDURES 

Our  approach  was  iterative,  engaging  end-users  through  user  experience  research 
methodologies  (using  observational  studies,  interviews,  and  surveys)  and  evolving  our 
solution  through  system  development  (building  user  interfaces,  the  modified  android 
operating  system,  and  a  classifier).  Each  of  the  iterative  cycles  of  our  project  was 
divided  into  stages,  with  results  from  each  stage  feeding  into  both  the  research 
directions  and  technical  goals  for  the  next  stage.  Stages  of  our  iterative  research 
process,  along  with  the  methods  employed  at  each  stage  are  described  below. 

Stage  1  Understand  user  needs  and  requirements  gathering 
Interviews  of  Stakeholders  and  Extreme  Users  Develop  Initial 
Requirements  from  Privacy  Concerns  Development  and  test  Lo-Fi 
prototypes 

Stage  2  Prototyping  and  Testing 

Develop  and  survey  of  Hi-Fi  Prototypes  of  user  control  dashboard 
Implementation  privacy  controls 

Interviews  of  Stakeholders  and  Extreme  Users  Refinement  of 
Classifier 

Stage  3  Implementation  and  Field  Testing 

Implementation  and  refinement  of  the  classifier 
Implementation  and  refinement  of  the  Android  OS 
Deployment  and  field  test  of  devices  Interviews  of  field  test  users 

Below  we  describe  the  details  of  the  methods  we  employed. 

3.1  User  Interviews 
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We  conducted  a  series  of  interviews  of  15  users  who  had  mobile  devices  and 
experiences  with  either  BYOD  currently  or  in  the  past  had  both  a  personal  and  work 
phone.  For  the  initial  interviews  we  focused  on  Extreme  Users',  users  who  represented 
more  sophistication  and  sensitivity  to  privacy  concerns  than  the  average  consumer. 
Focusing  on  Extreme  Users  is  a  methodology  that  is  employed  in  User  Experience 
research,  particularly  when  exploring  issues  such  as  privacy  that  are  not  typically  on 
most  users  mind.  The  Stanford  Design  School  (D.SchooE)  describes  research  on 
Extreme  Users  as  follows  [15]: 

Designers  engage  with  users  (people!)  to  understand  their  needs  and  gain  insights  about 
their  lives.  They  also  draw  inspiration  from  their  work-arounds  and  frameworks.  When 
you  speak  with  and  observe  extreme  users,  their  needs  are  amplified  and  their  work¬ 
arounds  are  often  more  notable.  This  helps  you  pull  out  meaningful  needs  that  may  not 
pop  when  engaging  with  the  middle  of  the  bell  curve.  However,  the  needs  that  are 
uncovered  through  extreme  users  are  often  also  needs  of  a  wider  population. 

Using  Extreme  Users  would  allow  us  to  uncover  issues  and  concerns  that  would  likely 
overlap  with  end-user  concerns  of  the  larger  population,  but  would  be  buried  or  difficult 
to  uncover  in  the  course  of  interviewing  laymen.  Eor  our  methodology,  we  used  Extreme 
Users  to  uncover  the  issues  that  were  most  important;  we  used  that  knowledge,  along 
with  our  own  experience  and  previous  work,  to  motivate  our  system  implementation; 
and  finally  we  went  back  to  the  larger  population  to  verify  and  confirm  our  insights. 

The  Extreme  Users  that  we  recruited  were  from  various  government  agencies 
(California  Department  of  Justice,  Eos  Angeles  District  Attorney,  etc.)  and  in  roles  as 
either  prosecutors  (civil  and  criminal),  investigators  or  law  enforcement  where  the 
separation  of  private  and  personal  information  was  very  important  to  their  jobs  and  in 
some  cases  personal  safety  and  safety  of  the  individuals  that  they  worked  with  was  of 
great  importance. 

Each  interview  we  conducted  was  either  over  the  phone  or  in-person,  and  lasted 
between  1-1.5  hours  total.  The  initial  interviews  were  semi-structured,  with  a  script  that 
the  moderator  walked  through  with  the  users,  with  time  for  unscripted  questions  at  the 
end.  These  initial  interviews  focused  on  their  current  practices  and  concerns,  and 
explored  ways  in  which  these  concerns  could  be  addressed  by  technology. 

The  next  iteration  of  interviews  also  focused  on  concerns,  but  for  the  later  portion  of  the 
interview  they  were  shown  a  working  prototype  that  incorporated  some  of  the  earlier 
suggestions  (e.g.,  location  and  address  book  controls),  as  well  their  understanding  and 
opinions  on  our  user  interface  (the  privacy  “dashboard”). 

The  last  round  of  interviews  we  performed  on  subjects  who  were  not  Extreme  Users, 
but  from  a  larger  population  from  our  field  study.  We  interviewed  users  initially  about 
their  phone  use,  and  again  at  the  end  of  our  field  study  about  their  experiences  with  our 
devices  and  using  the  controls. 
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3.2  Initial  Classifier  Design 

Our  goal  was  to  collect  a  feature  set  that  would  allow  the  platform  to  automatically  detect 
the  context  surrounding  each  permission  request,  so  that  once  trained,  the  platform  can 
use  these  features  to  infer  whether  a  permission  request  is  likely  to  be  deemed 
appropriate  by  the  user.  This  entails  determining  the  factors  that  define  a  context,  as  well 
as  the  factors  that  are  indicative  of  users’  privacy  preferences.  Defining  context  is 
important  because  it  scopes  privacy  decisions:  whenever  the  context  in  which  an 
application  requests  access  to  a  particular  data  type  changes,  the  system  needs  to  make 
a  decision  about  whether  or  not  to  grant  access.  Similarly,  in  order  to  make  these 
decisions  automatically  without  prompting  the  user,  this  entails  determining  the  factors 
that  predict  users’  privacy  preferences. 

Our  main  objective  was  to  explore  the  learnability  of  user  privacy  preferences  over 
prompting  for  every  sensitive  permission  request;  an  effectively-trained  permission 
system  should  be  able  make  decisions  on  behalf  of  the  users  accurately  with  minimal 
user  involvement.  While  prompting  for  every  sensitive  permission  request  gives  the 
user  a  finer  degree  of  control  over  their  privacy  protection,  we  showed  that  it  is 
impractical  due  to  the  sheer  volume  of  requests  that  would  result  [18];  we  believed  the 
best  way  forward  was  to  learn  a  users’  privacy  preferences.  Learning  involves  three 
essential  components: 

•  The  platform  needs  to  figure  out  which  permission  requests  are 
likely  to  defy  user  expectations. 

•  The  platform  also  needs  to  figure  out  which  permission  types  are 
more  sensitive  on  a  per-user basis. 

•  The  platform  needs  to  learn  what  other  observable  factors  are  used  in 
users’  decision  processes  and  how  they  can  be  used  to  infer  users’ 
decision-making. 

We  used  the  Experience  Sampling  Method  (ESM)  to  collect  ground  truth  data  about 
users’  privacy  preferences  [11].  ESM  involves  repeatedly  questioning  participants  in 
situ  about  a  recently  observed  event;  in  this  case,  we  probabilistically  asked  them  about 
an  application’s  recent  access  to  data  on  their  phone,  and  whether  they  would  have 
permitted  it  if  given  the  choice.  We  treated  participants’  responses  to  these  ESM  probes 
as  our  main  dependent  variable  (Figure  1).  We  instrumented  the  Android  platform  so 
that  these  prompts  would  periodically  appear — no  more  than  once  per  day — after 
recent  permissions  requests  (Table  1). 
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Resource  Access  Notification 

Google  Play  services  has  accessed 
your  location. 

Given  a  choice,  would  you  have 
allowed  or  denied  this  access? 

DENY  ALLOW 


.'J)  1 


Figure  1:  A  screenshot  of  an  ESM  prompt. 


Table  1:  Participants  in  the  first  study  received  prompts  when  applications  at¬ 
tempted  to  use  these  permissions 


Permission  Type 

Activity 

ACCESS  WIFI  STATE 

View  nearby  SSIDs 

NFC 

Communicate  via  NFC 

READ  HISTORY  BOOKMARKS 

Read  users’  browser  history 

ACCESS  FINE  LOCATION 

Read  GPS  location 

ACCESS  COARSE  LOCATION 

Read  network-inferred  location 
(i.e.,  cell  tower  and/or  WiFi) 

LOCATION  HARDWARE 

Directly  access  GPS  data 

READ  CALL  LOG 

Read  call  history 

ADD  VOICEMAIL 

Read  call  history 

READ  SMS 

Read  sent/received/draft  SMS 

SEND  SMS 

Send  SMS 

*INTERNET 

Access  Internet  when  roaming 

*WRITE  SYNC  SETTINGS 

Change  application  sync 
settings  when  roaming 

We  also  instrumented  participants’  smartphones  to  obtain  data  about  their  privacy- 
related  behaviors  and  the  frequency  with  which  applications  accessed  protected 
resources.  The  instrumentation  required  a  set  of  modifications  to  the  Android  operating 


Approved  for  Public  Release;  Distribution  Unlimited 
7 


system  and  flashing  a  custom  Android  version  onto  participants’  devices. 

Table  2  contains  the  complete  list  of  behavioral  and  runtime  events  our  instrumentation 
recorded.  The  behavioral  data  fell  under  several  categories,  all  chosen  based  on  several 
hypotheses  that  we  had  about  the  types  of  behaviors  that  might  correlate  with  privacy 
preferences:  web-browsing  habits,  screen  locking  behavior,  third-party  application 
usage  behavior,  audio  preferences,  call  habits,  camera  usage  patterns,  and  behavior 
related  to  security  settings.  For  example,  we  hypothesized  that  someone  who  manually 
locks  their  device  screen  are  more  privacy-conscious  than  someone  who  lets  it  time 
out. 


Table  2:  Instrumented  events  that  form  our  feature  set 


Type 

Event  Recorded 

Behavioral 

Instrumentation 

Changing  developer  options 

Opening/Closing  security  settings 

Changing  security  settings 

Enahling/Disabling  NFC 

Changing  location  mode 

Opening/Closing  location  settings 

Changing  screen-lock  type 

Use  of  two  factor  authentication 

Log  initial  settings  information 

User  locks  the  screen 

Screen  times  out 

App  locks  the  screen 

Audio  mode  changed 

Enahling/Disahling  speakerphone 

Connecting/Disconnecting  headphones 

Muting  the  phone 

Taking  an  audio  call 

Taking  a  picture  (front-  vs.  rear-facing) 

Visiting  an  HTTPS  link  in  Chrome 

Responding  to  a  notification 

Unlocking  the  phone 

Runtime 

Information 

An  application  changing  the  visibility 

Platform  switches  to  a  new  activity 

Permission 

Requests 

An  app  requests  a  sensitive  permission 

ESM  prompt  for  a  selected  permission 

The  primary  purpose  of  recording  user  behaviors  was  to  observe  how  much  time  a  user 
voluntarily  spends  on  making  security  and  privacy  related  decisions  by  changing 
default  settings  or  re-visiting  decisions  they  have  made  earlier  (security  settings, 
location  settings),  because  observing  these  behaviors  could  be  indicative  of  their  privacy 
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preferences.  For  instance,  the  choices  they  had  made  about  screen  locks;  how  careful 
they  are  with  their  web  browsing  habits,  such  as  how  often  do  they  use  Chrome 
incognito  tabs,  how  often  they  get  warnings  from  visiting  suspicious  websites,  and  so 
forth.  We  also  collected  other  observable  traits  that  could  be  indicative  of  privacy 
preferences,  such  as  how  often  they  take  pictures,  their  audio  preferences,  and  how 
active  they  are  with  audio  calls  using  the  phone. 

Finally,  we  collected  runtime  information  about  the  context  of  each  permission  request, 
including  the  visibility  of  the  requesting  application  at  the  time  of  request  (i.e.,  whether 
it  was  in  the  foreground  or  background),  what  the  user  was  doing  when  the  request  was 
made  (i.e.,  the  name  of  the  foreground  application),  and  the  exact  Android  API  function 
invoked  by  the  application  to  determine  what  information  was  requested.  The  visibility 
of  an  application  reflects  the  extent  to  which  the  user  was  likely  aware  that  the 
application  was  running;  if  the  application  was  in  the  foreground,  the  user  had  cues  that 
the  application  was  running,  but  if  it  was  in  the  background,  then  the  user  was  likely 
not  aware  that  the  application  was  running  and  therefore  might  find  the  permission 
request  unexpected — some  background  services  can  still  be  visible  to  the  user  due  to 
on-screen  notification  or  other  cues  that  could  be  perceptible.  We  monitored  processes’ 
memory  priority  levels  to  determine  the  visibility  of  all  Android  processes.  We  also 
collected  information  about  which  Android  Activity  was  active  in  the  application, 
which  indicates  the  UI  elements  exposed  to  the  user. 

Further  details  on  the  results  and  methodology  are  described  in  the  Appendix  and  in 
[19]. 

3.3  Prototype  and  User  Testing 

We  tested  various  stages  of  the  user  interface  and  the  notification  interaction  with  the 
classifier,  as  part  of  our  methodology.  We  performed  remote  user  testing  of  the 
classifier  with  a  walk-through,  talk  aloud  protocol  on  a  small  set  of  10  initial  users,  and 
then  performed  a  large  scale  survey  and  test  of  the  user  interface  on  400,  and 
subsequently  580  users.  A  detailed  description  of  our  test  process  and  results  are 
included  in  the  Appendix  and  in  the  paper  [17]. 

3.4  System  Implementation 

We  implemented  a  complete  ML  pipeline  that  includes:  mechanisms  to  allow  users  to 
review  and  redress  their  decisions  based  on  the  results  of  our  UI  proto  typing  study 
[17];  ways  to  mask  resource  denial  from  apps  so  that  apps  continue  to  run,  even  when 
permissions  are  denied  (unless  those  permissions  were  critical  to  their  functionality); 
and  finally,  a  classifier  that  takes  surrounding  contextual  signals  to  predict  user 
preferences  for  each  permission  request.  This  means  our  usability  study  is  a  more 
accurate  assessment  of  how  the  system  behaves  in  the  wild  than  the  previous 
investigations,  which  relied  on  user  expectations  rather  than  consequential  privacy 
decisions.  This  final  study  validates  that  prior  work. 


Approved  for  Public  Release;  Distribution  Unlimited 
9 


3.4.1  A  Local  SVM  Classifier 


We  previously  implemented  an  offline  classifier  at  the  onset  of  this  project  and 
suggested  this  could  be  deployed  as  a  remote  web-accessible  service  in  order  to  shift 
compute  costs  from  the  mobile  device  to  a  more  powerful  dedicated  server  [  19] .  We  note, 
however,  that  this  design  required  sending  privacy-relevant  data  beyond  the 
smartphone,  which  creates  a  larger  attack  surface  and  increases  system  costs  and 
complexity.  It  also  creates  significant  security  risks  if  the  server  responsible  for  making 
decisions  is  compromised  or  is  trained  with  spurious  data. 

To  mitigate  these  security  and  privacy  issues,  we  implemented  and  integrated  the  full 
SVM  classifier  into  the  Android  operating  system  as  a  system  service.  We  ported  the 
open-source  implementation  of  libsvm  to  Android  6.0.1  (Marshmallow)  [4],  and  built 
two  additional  system-level  services  to  interface  with  the  SVM:  the  SVMTrainManager, 
which  trains  the  model  using  user-provided  privacypreferences  through  prompts  (See 
Figure  2);  and  the  PermissionService,  which  uses  the  SVM  to  regulate  applications 
accessing  permission-protected  resources  and  issues  a  prompt  for  the  user  for  cases  when 
the  model  produces  low-confidence  predictions.  The  SVMTrainManager  notifies  the 
PermissionService  when  the  model  is  trained  and  ready  for  use.  These  two  new  services 
are  implemented  into  the  core  Android  operating  system,  and  neither  are  accessible  to 
third-party  apps.  On  average,  model  training  takes  less  than  5  seconds.  We  instrumented 
all  Android  control  flows  responsible  for  sharing  sensitive  permission-protected  data 
types  to  pass  through  this  new  pipeline. 


Figure  2:  A  screenshot  of  a  permission  request  prompt. 
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3.4.2  Bootstrapping 

We  deployed  our  trainable  permission  system  along  with  a  generic  model  that  was  pre¬ 
trained  with  the  real-world  permission  decisions  of  131  users,  shared  with  us  from  our 
initial  work  [19].  This  ensured  that  a  new  user  has  an  initial  model  for  making  privacy 
decisions.  This  initial  model,  however,  is  inadequate  for  accurately  predicting  any 
particular  individual  user’s  preferences,  because  it  simply  has  no  knowledge  of  that 
particular  user.  Despite  that,  we  previously  showed  that  our  model  only  needs  12 
additional  user-provided  permission  decisions  before  the  model  attains  peak  accuracy. 
Given  this,  our  implemented  system  requires  that  the  user  make  12  decisions  early  on 
to  train  the  initial  model  to  that  particular  user’s  preferences. 

The  initial  12  decisions  are  selected  based  on  weighted  reservoir  sampling.  We  weigh 
the  combination  of  application:permission:visibility  by  the  frequency  that  these  are 
observed;  the  most-frequent  combinations  are  the  likeliest  to  produce  a  permission 
request  prompt  (Figure  1).  The  intuition  behind  this  strategy  is  to  focus  more  on  the 
frequently  occurring  permission  requests  over  rarer  ones.  We  used  these  same  prompts 
for  validating  our  classifier  during  the  field  study. 

3.4.3  Feature  Set 

Our  model  considers  the  name  of  the  application  requesting  the  permission,  the 
application  in  the  foreground  at  the  time  of  the  request  (if  different  than  the  application 
making  the  request),  the  requested  permission  type  (e.g..  Location,  Camera,  Contacts), 
and  the  visibility  of  the  application  making  the  request.  In  a  pilot  study,  our  system 
implemented  the  full  feature  set  we  previously  used  [19]. 

This  design,  however,  resulted  in  a  noticeable  reduction  in  device  responsiveness  as 
reported  by  multiple  study  participants.  We  subsequently  removed  the  “time  of  request” 
feature  for  the  second  phase  of  our  study.  The  removal  of  the  time  feature  from  the  ML 
enabled  the  platform  to  cache  higher  number  of  ML  decisions  saving  reducing  the 
overhead  stemming  from  running  the  ML  for  each  different  permission  request. 

3.4.4  Sensitive  Resources 

Previous  work  by  Felt  et  al.  argued  that  certain  permissions  should  be  presented  as 
runtime  prompts,  as  those  permissions  guard  sensitive  resources  whose  use  cases 
typically  impart  contextual  cues  indicating  why  an  app  would  need  that  resource  [8]. 
Beginning  with  Android  6.0  (Marshmallow),  the  OS  designated  certain  permissions  as 
“dangerous”  [10],  and  prompts  the  user  to  grant  or  deny  permission  when  an  app  tries 
to  use  it  for  the  first  time.  The  user’s  response  to  this  prompt  then  carries  forward  to  all 
future  uses  of  that  resource  by  the  requesting  application. 

Our  experimental  permission  system  uses  both  Felt’s  set  of  recommended  permissions 
for  runtime  prompts  and  Android’ s  own  “dangerous”  ones.  We  did,  however,  opt  to  omit 


'  “application”  is  the  app  requesting  the  permission,  “permission”  is  the  requested  resource  type,  and 
“visibility”  denotes  whether  the  user  is  made  aware  that  the  app  is  running  on  the  device. 
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a  few  permissions  from  the  resulting  set  that  we  viewed  as  irrelevant  to  most  users.  The 
INTERNET  and  WRITE  SYNC  SETTINGS  permissions  were  discounted,  as  we  did  not 
expect  any  participant  (all  recruited  locally)  to  roam  internationally  during  the  1-week 
study  period.  We  eliminated  the  NEC  permission  because  previous  work  demonstrated 
that  very  few  apps  operate  on  NFC  tags.  Our  system  ignores  the  read  history 
BOOKMARKS  permission,  as  this  is  no  longer  supported. 

We  extended  our  initial  framework  [18, 19]  to  monitor  and  regulate  all  attempts  by  apps 
to  resources  protected  by  any  of  the  24  permissions  we  monitored.  We  avoid  false 
positives  by  monitoring  both  the  requested  permission  and  the  returneddata  type. 

3.4.5  Permission  Denial 

Making  changes  to  the  permission  system  carries  the  risk  of  app  instability,  as  apps  may 
not  expect  to  have  their  resource  requests  denied  [7].  If  denying  permissions  results  in 
frequent  crashes,  then  users  may  become  more  permissive  simply  to  improve  app 
stability.  We  therefore  designed  our  implementation  with  this  concern  in  mind:  rather 
than  simply  withholding  sensitive  information  in  the  event  of  a  denied  permission,  our 
system  supplies  apps  with  well-formed  but  otherwise  non-private  “spoofed”  data.  This 
enables  apps  to  continue  functioning  unless  access  to  the  permission-protected  resource 
is  critical  to  the  app’s  correct  behavior. 

For  example,  if  an  app  requests  access  to  the  microphone,  but  our  permission  system 
denies  it,  the  app  will  still  receive  a  valid  audio  stream:  not  an  actual  signal  from  the 
microphone,  but  that  of  a  pre-recorded  generic  sound.  (In  our  implementation  we  used 
a  loop  of  a  whale  song).  This  design  allows  apps  to  operate  on  valid  data  while  still 
preserving  user  privacy. 

Permission-protected  databases  (e.g.,  contact  lists  and  calendars)  require  finer-  grained 
regulation  under  our  permission  system.  For  instance,  an  app  may  have  a  legitimate 
need  to  access  the  contact  list.  Under  the  stock  Android  permission  system,  an  app  is 
either  able  to  read  all  contacts  or  no  contacts.  We  improve  upon  this  by  adding  a  notion 
of  provenance  to  each  entry:  every  contact  list  item  contains  a  field  that  records  the  app 
that  created  the  entry.  If  our  permission  system  denies  an  app  access  to  the  contact  list, 
the  app  is  still  able  to  write  into  the  contacts  database  and  read  back  any  entries  that  it 
previously  created.  Apps  without  these  database  permissions  are  effectively  placed  in 
a  sandbox,  in  which  they  can  still  carry  out  valid  operations  on  their  own  versions  of 
the  data.  They  neither  produce  an  exception  nor  obtain  all  the  information  in  the 
database.  We  allow  full  access  to  the  databases  only  to  apps  that  are  granted  the 
appropriate  permission. 

3.4.6  Contextually  Aware  Permission  Manager 

We  recognize  that  our  classifier  is  bound  to  make  mistakes.  Therefore,  it  is  crucial  to 
provide  a  mechanism  for  users  to  review  and  amend  decisions  made  by  the  permission 
model  on  their  behalf.  Mobile  operating  systems  have  configuration  panels  to  manage 
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app  permissions,  but  these  fail  to  provide  users  key  information  or  options  to  make 
informed  decisions.  However,  our  study  that  examined  user  interface  prototypes  [17] 
proposed  a  new  interface  to  solve  this  problem  [17].  In  that  study  (still  a  part  of  this 
project),  we  evaluated  several  designs  using  interactive  online  mock-ups  and  found  that 
the  design  significantly  improved  user  experience  over  the  stock  configuration  panel. 
We  followed  those  recommendations  in  the  design  that  we  built  as  part  of  this 
implementation. 

We  built  our  contextual  permission  manager  as  a  system-space  app,  similar  to 
Android’s  Settings  app  (Figure  3).  Our  permission  manager  has  three  main  objectives: 
(i)  to  display  all  recent  permission  requests  and  the  corresponding  “allow”  or  “deny” 
decisions  from  the  ML  model;  (ii)  to  allow  users  to  review  and  change  app  permissions; 
and  (iii)  to  display  all  the  resources  an  app  can  access. 
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Figure  3:  The  recent-allowed  app  activity  (left),  a  list  of  installed  apps  and  their 
associated  permissions  (center).  Permissions  can  he  always  granted,  granted  only 

when  in  use,  or  never  granted  (right). 


When  users  set  preferences  (rules)  in  the  permission  manager,  before  making  a  ML 
decision,  platform  checks  to  see  if  the  user  has  set  any  rules  for  the  current  request;  if  a 
match  is  found,  rather  than  going  to  the  ML,  platform  will  use  the  current  rule  to  respond 
to  the  permission  request  accordingly.  The  system  does  not  use  these  user-set  rules  to 
train  the  ML  model,  it  is  hard  to  capture  the  contextuality  behind  these  changes  so  the 
platform  cannot  create  any  of  the  contextual  features  to  train  the  ML. 

3.5  Validation  Study  Methodology 

We  tested  our  implementation  by  performing  a  field  study  with  37  participants.  Our 
goals  were  to  understand  how  third-party  apps  and  end-users  react  to  a  more  restrictive 
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and  selective  permission  model,  as  compared  to  the  default  AOFU  model. 

For  a  period  of  one  week,  each  participant  used  a  smartphone  (Nexus  5X)  running  a 
custom  version  of  the  Android  OS  (a  variation  of  Android  6.0.1)  built  with  the  new 
permission  system  detailed  in  the  previous  section.  During  the  study  period,  all  of  a 
participant’s  sensitive  data  was  protected  by  the  new  contextually-aware  permission 
model. 

3.5.1  Participant’s  Privacy  Preferences 

We  used  the  Experience  Sampling  Method  (ESM)  to  understand  howparticipants  want 
to  control  certain  sensitive  resource  accesses  [11],  similar  to  how  we  used  it  in  our 
earlier  work  [19].  ESM  involves  repeatedly  questioning  participants  in  situ  about  a 
recently  observed  event;  in  our  case,  the  event  is  an  app  requesting  access  to  a  sensitive 
resource.  We  probabilistically  asked  them  about  an  application’s  recent  request  to 
access  to  data  on  their  phone,  and  how  they  want  to  control  future  similar  requests 
(Eigure  1).  We  treated  participants’  responses  to  these  ESM  prompts  as  our  main 
dependent  variable,  which  we  used  to  validate  the  accuracy  of  the  decisions  that  the 
classifier  was  automaticallymaking. 

Each  participant  during  the  study  period  responded  to  4  prompts  per  day,  and  at  most 
one  per  hour.  The  prompting  was  divided  into  two  phases.  The  first  phase  was  the 
bootstrapping  phase,  which  we  described  earlier,  to  train  the  classifier.  The  second 
phase  was  the  validation  phase,  which  was  used  to  measure  the  accuracy  of  the  ME 
model.  In  addition  to  the  validation  phase  prompts,  participants  might  also  have 
occasional  prompts  for  low-confidence  decisions  made  by  the  ME;  a  detailed 
discussion  on  low-confidence  decisions  is  provided  later.  During  our  study  period,  only 
4  participants  ever  experienced  low -confidence  prompts. 

3.5.2  Recruitment 

We  recruited  participants  in  two  phases:  a  pilot  in  May  2017  and  the  full  study  in 
August  2017.  We  placed  a  recruitment  ad  on  Craigslist  under  “et  cetera  jobs”  and 
“domestic  gigs.”^The  title  of  the  advertisement  was  “Smartphone  Research  Study,”  and 
it  stated  that  the  study  was  about  how  people  interact  with  their  smartphones.  We  made 
no  mention  of  security  or  privacy.  Interested  participants  downloaded  a  screening  app 
from  the  Google  Play  store,  which  asked  for  demographic  information  and  collected 
their  smartphone  make  and  model.  We  screened  out  applicants  who  were  under  1 8  years 
of  age  or  used  CDMA  providers,  since  our  experimental  phones  were  only  GSM- 
compatible.  We  collected  data  on  participants’  installed  apps,  so  that  we  could  pre¬ 
install  free  apps  prior  to  them  visiting  our  laboratory.  (We  only  encountered  paid  apps 
for  a  few  participants,  and  those  apps  were  installed  once  we  setup  their  Google  account 
on  the  test  phone.) 

We  scheduled  a  time  with  participants  who  met  the  screening  requirements  to  do  the 

^  The  study  was  approved  by  our  Institutional  Review  Board. 
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initial  setup.  Overall,  63  people  showed  up  to  our  laboratory,  and  of  those,  61  qualified 
(2  were  rejected  because  our  screening  application  did  not  identify  some  CDMA 
carriers).  The  initial  setup  took  roughly  30  minutes  and  involved  transferring  their  SIM 
cards,  helping  them  set  up  their  Google  and  other  accounts,  and  making  sure  they  had 
all  the  applications  they  used.  We  compensated  each  participant  with  a  $35  gift  card  for 
showing  up. 

During  the  pilot  phase,  out  of  20  people  who  were  given  phones,  14  participants  had 
technical  issues  with  the  phone  preventing  them  from  using  it,  leaving  only  6 
participants  with  usable  data.  During  the  main  phase,  out  of  42  people  who  were  given 
phones,  we  had  the  following  issues: 

•  4  participants  misinterpreted  our  ESM  prompts  so  we  filtered  out 
their  prompt  responses; 

•  5  participants  suffered  from  a  bug  in  the  code  that  inhibited  the 
validation  phase  of  the  ML; 

•  2  participants  performed  factory  resets  on  the  phone  before 
returning  it,  which  destroyed  stored  logs. 

This  left  31  participants  with  usable  data  from  the  main  phase.  We  combined  the  6 
participants  with  usable  data  from  the  first  phase  with  the  31  from  the  second  phase  to 
produce  our  sample  of  37  users,  since  we  did  not  alter  the  study  between  phases.  All 
our  results  are  drawn  from  log  data  and  interview  responses  from  those  37  users.  Of 
that  population,  21  were  female  and  16  were  male;  ages  ranged  from  18  to  59  years  old 
()U=  34.25,  er=9.92). 

After  initial  setup,  participants  used  the  experimental  phones  for  one  week  in  lieu  of 
their  normal  phones.  They  were  allowed  to  install,  use,  and  uninstall  any  apps  that  they 
wanted.  Our  logging  framework  kept  track  of  every  protected  resource  accessed  by  an 
app,  along  with  the  contextual  data  surrounding  each  application  request.  All  the 
logged  events  were  stored  compressed  in  the  local  system. 

3.5.3  Exit  Interview 

When  participants  returned  to  our  laboratory,  we  first  copied  the  log  data  from  the 
phones  to  make  sure  that  they  had  actually  used  the  phone  during  the  study  period.  We 
then  administered  a  semi-structured  exit  interview,  which  had  four  components: 

•  New  Permission  Manager  UI — ^We  asked  participants  to  show  us 
how  they  would  use  the  UI  (Figure  3)  to  block  a  given  application  from 
accessing  background  location  data,  as  well  as  how  difficult  they  found 
it.  We  also  checked  our  data  to  see  how  they  interacted  with  the  UI 
during  the  study  period,  and  asked  them  about  the  circumstances  for 
those  interactions.  The  objective  of  this  task  was  to  validate  the  design 
objectives  of  the  UI,  including  whether  they  use  it  to  resolve  issues 
stemming  from  resourcedenial. 
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•  Permission  Prompts — ^We  asked  participants  questions  about 
permission  prompts  they  had  encountered  during  the  study.  We  asked 
why  they  allowed  or  denied  permission  requests  and  also  how  they  felt 
about  the  prompts.  We  asked  them  to  rate  their  experience  with  the 
prompts  across  3  different  categories:  levels  of  surprise,  feelings  of 
control,  and  to  what  extent  they  felt  the  new  system  had  increased 
transparency.  The  objective  of  this  section  was  to  understand  the  impact 
of  the  runtime  prompts. 

•  Permission  Models — ^We  asked  participants  questions  about  their 
perspectives  on  the  privacy  protections  in  Android.  We  asked  how  much 
they  understood  the  current  system.  We  then  explained  our  new  system, 
and  asked  how  they  felt  about  letting  ML  act  on  their  behalf.  The 
objective  of  this  section  was  to  understand  how  much  participants 
actually  understood  the  new  permission  model. 

•  Privacy  Concerns — Finally,  we  asked  participants  how  they 
usually  make  privacy  decisions  on  their  mobile  devices,  how  serious 
they  are  about  privacy,  and  how  much  are  they  willing  to  pay  for  privacy. 

We  also  asked  demographic  questions. 

Three  researchers  independently  coded  144  responses  to  the  Permission  Prompts  and 
Permission  Model  questions  (the  other  questions  involved  either  direct  observations  or 
reporting  participants  ’  responses  verbatim  without  the  need  for  coding) .  Prior  to  meeting 
to  achieve  consensus,  the  three  coders  disagreed  on  17  responses,  which  resulted  in  an 
inter-rater  agreement  of  86.43%  and  Fleiss’  kappa  yielded  0.747,  indicating  substantial 
agreement. 

After  the  exit  survey,  we  answered  any  remaining  questions,  and  then  assisted  them  in 
transferring  their  SIM  cards  back  into  their  personal  phones.  Finally,  we  compensated 
each  participant  with  a  $100  gift  card. 

4  RESULTS  AND  DISCUSSION 

At  the  end  of  the  study  period,  we  collected  1,159  privacy  decisions  (prompt  responses) 
from  37  participants.  A  total  of  133  unique  applications  caused  prompts  for  17  different 
sensitive  permission  types.  During  the  study  period,  24.23%  of  all  runtime  prompts 
were  denied  by  participants.  Most  (66%)  of  these  prompts  occurred  when  the 
requesting  application  was  running  visibly.  Our  instrumentation  logged  5.4M  sensitive 
permission  requests  originating  from  253  unique  applications  for  17  different 
permission  types.  On  average,  a  sensitive  permission  request  occurred  once  every  4 
seconds. 

In  the  remainder  of  the  paper,  we  describe  the  shortcomings  of  the  existing  ask-on- 
first-use  permission  model,  both  in  accuracy  and  in  aligning  with  users’  expectations; 
we  show  how  our  proposed  system  has  vastly  greater  accuracy  in  inferring  users’  privacy 
preferences  and  applying  them  towards  regulating  application  permissions;  and  we 
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show  that  is  does  this  with  minimal  impact  on  app  functionality.  Finally,  we  present 
results  from  the  exit  interviews  regarding  participants’  perceptions  about  the  training 
prompts  and  the  privacy  management  user  interface. 

4.1  Status  Quo  Problems 

In  the  “ask-on-first-use”  (AOFU)  model,  the  user  is  prompted  only  the  first  time  an  app 
attempts  to  access  a  protected  resource.  Requesting  these  permissions  at  runtime  allows 
the  user  to  infer  the  potential  reason  for  the  request,  based  on  what  they  were  doing  at 
the  time  (i.e.,  context).  AOFU’s  shortcoming,  however,  is  that  it  naively  reapplies  the 
user’s  first-use  decision  in  subsequent  scenarios,  without  adapting  to  different  contexts. 
Our  previous  work  showed  that  failing  to  account  for  changing  contexts  produces  high 
error  rates  (i.e.,  the  user  would  have  opted  to  deny  permission  if  AOFU  had  not  granted 
it  based  on  the  first-use  prompt)  [18]. 

We  note  that  our  initial  work  on  this  project  measured  the  accuracy  of  the  AOFU  model 
by  merely  collecting  users’  responses  to  runtime  permission  prompts,  without  actually 
enforcing  them  by  denying  apps  access  to  data  [19].  Thus,  the  accuracy  rates  reported 
by  that  study  may  not  actually  be  valid,  since  users  may  elect  to  change  their 
permission-granting  preferences,  if  they  result  in  a  loss  of  application  functionality. 
Thus,  we  evaluated  the  performance  of  the  AOFU  approach  (in  current  use  by  Android 
and  iOS)  by  presenting  participants  with  permission  prompts  that  actually  resulted  in 
the  denial  of  application  permissions. 

During  the  study  period,  each  participant  responded  to  combinations  of 
application-.permission  more  than  once.  As  AOFU  is  deterministic,  we  can  simulate  it 
by  comparing  a  user’s  first  response  to  an  application: permission  combination  to  future 
responses  to  the  prompts  for  the  same  app  and  permission.  We  use  this  data  to  measure 
how  often  AOFU  matches  the  user’s  preference  in  subsequent  requests. 

Our  data  show  that  the  AOFU  permission  model  has  a  median  error  rate^  of  20%:  in 
more  than  one-fifth  of  app  requests  for  permission-protected  resources,  participants 
changed  their  initial  response  for  the  same  application-.permission  combination.  Of  the 
37  participants,  64%  had  at  least  one  such  discrepancy  between  the  first-use  and 
subsequent  preferences.  This  refutes  AOFU’s  core  assumption  that  only  few  users  will 
deviate  from  their  initial  preferences  in  future  cases.  This  observation  corroborates  the 
initial  study  [19],  in  which  79%  of  131  participants  were  shown  to  deviate  from  their 
initial  responses  in  subsequent  cases. 

The  errors  shown  in  AOFU,  could  be  either  privacy  violations  or  losses  of  functionality. 
A  privacy  violation  occurs  when  the  system  grants  an  app  access  to  a  protected 
resource,  contrary  to  the  user’s  preference,  had  she  been  prompted. 

Loss  of  functionality  occurs  when  the  permission  system  denies  access  to  a  protected 


^  We  report  medians  because  the  error  rate  was  not  normally  distributed  among  participants. 
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resource,  which  the  user  would  have  otherwise  permitted.  We  consider  privacy 
violations  to  be  the  more  severe  type  of  error,  as  the  user  is  unable  to  take  back  sensitive 
information  once  an  app  has  acquired  it  and  transmitted  it  to  a  remote  server.  However, 
loss  of  functionality  is  still  undesirable  because  those  errors  might  incentivize  the  user 
to  be  overly  permissive  in  order  to  regain  that  functionality.  From  our  data,  we  found 
that  66.67%  of  AOFU  errors  wereprivacy  violations;  the  remaining  33.33%  were  losses 
in  functionality. 

4.1.1  AOFU  User  Expectations 

Errors  in  permission  systems  could  arise  from  a  variety  of  reasons.  Mismatched  user 
expectations  and  lack  of  comprehension  are  two  critical  ones,  which  could  hamper  any 
permission  model’s  utility.  User  comprehension  is  critical  because  users  may  make 
suboptimal  decisions  when  they  do  not  fully  understand  permission  prompts,  hindering 
the  ability  of  the  permission  system  to  protect  sensitive  system  resources.  Users  must 
be  able  to  comprehend  the  decision  they  are  making  and  the  consequences  of  their 
choices.  Recent  work  on  AOFU  has  examined  the  motives  behind  users’  decisions  and 
how  it  varies  between  categories  of  applications,  as  well  as  how  people  adapt  their 
behavior  to  the  new  model  [3,  2,  1]. 

In  our  study,  the  participants  had,  on  average,  5  years  of  experience  with  Android.  This 
indicates  that  most  of  our  participants  have  experienced  both  install-time 
permissions — the  permission  model  prior  to  Android  6.0,  released  in  2015 — and 
runtime  “ask-on-first-use”  permission  prompts.  The  majority  of  participants  said  they 
noticed  the  shift  to  AOFU  prompts,  and  they  were  aware  that  these  prompts  are  a  way 
to  ask  the  user  for  consent  to  share  data  with  an  app.  A  large  minority  of  participants 
(40%),  however,  had -an  inadequate  understanding  of  how  AOFU  works,  which  could 
substantially  hinder  that  permission  model’s  effectiveness  in  protecting  user  data. 

Four  out  of  the  37  participants  expressed  doubts  about  the  rationale  behind  the  prompts. 
Rather  than  seeing  permission  prompts  as  a  way  for  users  to  regulate  access  to  their 
sensitive  data,  these  participants  viewed  these  prompts  as  a  mechanism  to  extract  more 
information  from  them: 

“When  I  see  prompts,  I  feel  like  they  want  to  know  something  about  me,  not  that 
they  want  to  protect  anything.  ”  (P21) 

One  possible  explanation  is  that  some  users  grew  accustomed  to  install-time  prompts, 
and  subsequently  perceived  the  change  to  runtime  prompts  as  a  new  way  for  Android 
to  collect  user  data.  Although  it  is  impractical  to  project  how  prevalent  this  sentiment 
is  in  the  general  population,  we  cannot  reject  its  existence.  Hence,  more  work  is  needed 
to  measure  its  impact  and  explore  the  potential  solutions. 

A  third  (3 1 .4%)  of  our  participants  were  not  aware  that  responding  to  an  AOFU  prompt 
results  in  a  blanket  approval  (or  denial)  that  carries  forward  to  all  the  app’s  future  uses 
of  the  requested  resource.  Most  participants  believed  that  responses  were  only  valid  for 
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a  certain  amount  of  time,  such  as  just  for  that  session  or  just  that  single  request.  This 
misconception  significantly  hinders  AOFU’s  ability  to  correctly  anticipate  the  user’s 
preferences  in  future  occurrences.  Again,  this  observation  raises  the  question  of 
whether  users  would  respond  differently  if  they  had  a  more  accurate  understanding  of 
how  AOFU works: 

“[I]  didn’t  know  that  granting  a  permission  carries  forward  in  the  future  until 
otherwise  changed.  [I]  expected  permissions  to  be  for  just  that  one  use.  ”  (P25) 

It  is  clear  that  granting  blanket  approval  to  sensitive  resources  is  not  what  users  expect 
all  the  time.  On  the  other  hand,  had  our  participants  been  asked  for  their  input  on  every 
permission  request,  they  would  have  received  a  prompt  once  every  4  seconds — 
involving  the  user  more  frequently  has  practical  limitations.  How,  then,  can  we  best 
project  users’  privacy  preferences  to  future  scenarios  without  overwhelming  them  with 
prompts? 

4.2  Classifier  Accuracy 

During  the  week-long  study  period,  each  participant  was  subject  to  two  operational 
phases  of  the  contextual  permission  system:  (a)  the  initial  learning  phase,  where 
participant  responses  to  prompts  were  used  to  re-train  the  SVM  classifier  according  to 
each  individual’s  preferences,  and  (b)  the  steady-state  validation  phase,  where 
responses  to  prompts  were  collected  to  measure  the  accuracy  of  the  classifier’s 
decisions. 

As  previously  discussed  in  our  section  on  bootstrapping,  we  use  weighted  reservoir 
sampling  during  the  learning  phase  to  prioritize  prompting  for  the  most  commonly 
observed  instances  of  application:permission:visibility  combinations.  During  the 
validation  phase,  participants  received  the  same  prompts  for  random  combinations  of 
features.  This  ensured  that  we  collected  validation  results  both  for  previously- 
encountered  and  new  combinations.  We  placed  a  maximum  limit  of  3  prompts  per 
combination  in  order  to  further  improve  prompt  diversity  and  coverage.  After 
presenting  participants  with  prompts,  the  instrumentation  recorded  the  response  and 
the  corresponding  decision  produced  by  the  classifier.  Using  participant  responses  to 
prompts  as  ground-truth,  we  measured  the  classifier’s  accuracy  during  the  validation 
phase.  From  our  sample  of  37  participants,  we  had  to  exclude  6  of  them  due  to  a  cache 
coherency  bug  that  was  discovered  after  the  pilot,  which  degraded  classifier 
performance.  For  the  remainder  of  this  section,  our  results  are  drawn  from  the 
remaining  sample  of  31,  unless  otherwise  noted. 

Taken  as  a  whole,  these  3 1  participants  responded  to  640  total  prompts  in  the  validation 
phase.  Our  contextual  permission  model  produced  a  median  accuracy  of  90%, 
compared  to  80%  under  AOFU  for  the  same  population.  The  classifier  reduced  AOFU’s 
error  rate  by  50%,  with  the  majority  of  classifier  errors  consisting  of  privacy  violations 
(i.e.,  access  granted  when  the  user  would  have  denied  it). 
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4.2.1  Offline  Learning 

We  were  curious  whether  the  accuracy  of  our  system  could  be  improved  through  the 
use  of  offline  learning,  which  would  require  much  more  computing  power.  Using 
participant  responses  to  permission  prompts,  we  analyzed  how  an  offline  SVM 
classifier  would  perform.  We  implemented  the  SVM  model  using  the  KSVM  module  in 
R.  We  performed  this  analysis  on  data  from  all  37  participants,  using  leave-one-out 
cross-validation  to  evaluate  how  the  offline  classifier  would  perform  for  each  participant. 

The  offline  model  had  a  median  accuracy  of  94.74%  across  the  37  participants.  By 
comparison,  AOFU  had  an  80%  accuracy  for  the  same  population.  This  represents  a 
75%  error  reduction  in  the  offline  contextual  model  compared  to  AOFU.  These 
numbers  corroborate  our  prior  findings  [19].  We  stress  the  significance  of  this 
corroboration,  because  the  results  hold  in  the  presence  of  actual  resource  denial,  which 
was  not  examined  in  the  prior  study.  This  suggests  that  users  will  continue  to  indicate 
their  true  preferences  in  response  to  prompts,  even  when  those  preferences  are 
enforced,  potentially  resulting  in  unanticipated  app  behavior. 

We  note  the  accuracy  difference  between  the  SVM  classifier  we  integrated  into 
Android  and  the  R  model  (90%  vs.  94.74%,  respectively).  This  is  due  to  how  the 
Android  SVM  implementation  performs  the  bootstrapping.  This  issue  is  not  inherent 
to  integrating  an  SVM  classifier  into  Android.  An  updated  implementation  has  the 
potential  to  reach  the  maximum  accuracy  observed  in  the  offline  model. 

4.2.2  Decision  Confidence 

In  our  initial  investigation  of  classifier-based  permission  models,  we  proposed  using 
decision  confidence  to  determine  for  which  application:  permission:visibility 
combinations  users  should  be  prompted  in  the  validation  phase  [19].  The  rate  of 
decision  confidence  is  also  a  measure  of  the  extent  to  which  the  classifier  has  learned 
the  user’s  preferences.  The  authors  suggested  that  if  this  rate  does  not  decrease  over 
time,  then  AOFU  will  likely  be  a  better  system  for  thoseusers. 

In  addition  to  the  prediction,  our  classifier  also  produced  a  class  probability,  which  we 
used  as  the  measure  of  decision  confidence.  The  classifier  produced  a  binary  result 
(i.e.,  allow  or  deny)  with  a  cutoff  point  of  0.5.  A  decision  probability  close  to  the  cutoff 
point  is  a  less  confident  result  than  one  far  from  it.  We  used  the  95%  confidence  interval 
as  a  threshold  to  determine  which  decisions  were  low-confidence  and  which  ones  were 
not. 

Only  4  of  our  field  study  participants  experienced  low-confidence  classifier  decisions 
that  caused  a  prompt  to  appear  after  the  bootstrapping  period.  Each  of  these  participants 
had  just  one  such  low-confidence  prompt  appear.  These  prompts  retrained  the 
classifier,  so  the  lack  of  any  subsequent  low-confidence  prompts  indicates  that  the 
classifier  produced  high-confidence  predictions  for  the  same 
application:permission:visibility  combination  in  future  cases. 
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The  lack  of  additional  training  prompts  also  suggests  that  users  are  less  likely  to 
become  habituated  to  prompting.  The  4  participants  who  each  received  one  additional 
prompt  saw  a  total  of  13  prompts  (including  the  12  prompts  during  the  training  phase). 
The  remaining  27  participants  saw  just  the  12  training  phase  prompts.  Had  our 
participants  been  subject  to  AOFU  instead  of  our  contextual  permission  system,  they 
would  have  received  a  median  of  15  prompts  each,  with  a  quarter  of  the  participants 
receiving  more  than  17.  Instead,  we  achieved  a  75%  error  reduction  (80%  vs.  94.74%) 
and  reduced  user  involvement  by  20%  (12  prompts  vs.  15)  through  the  use  of  classifier- 
driven  permissions,  compared  to  AOFU. 

4.3  Impact  on  App  Functionality 

Previous  research  has  shown  that  many  applications  do  not  properly  handle  cases  where 
they  are  denied  permission  to  access  a  protected  resource  [7] .  One  core  objective  of  our 
work  was  to  measure  how  apps  responded  to  a  stricter  permission  model  than  AOFU. 
For  example,  the  system  will  be  unusable  if  it  causes  erratic  application  behavior, 
through  the  use  of  dynamically  granted  permissions. 

In  the  field  study,  our  platform  instrumentation  recorded  each  application  crash  and  its 
corresponding  exception  message.  This  information  allowed  us  to  identify  the  possible 
root  cause  of  the  crash  and  whether  it  was  related  to  resourcedenial. 

We  observed  18  different  exceptions  classes,  such  as  Security  Except!  on, 
RuntimeException,  and  NullPointerException.  Eor  the  remainder  of  this 
section,  we  focus  on  Security  Exceptions,  which  is  directly  related  to  resource 
denials.  Almostall(98.96%)oftherecordedSecurityExceptions  were  observed 
on  the  devices  of  just  two  participants.  Each  of  the  remaining  participants 
encountered,  on  average,  18  SecurityExceptions  during  the  study  period  (i.e., 
roughly  3  SecurityExceptions  per  day  per  participant) . 

Almost  all  (99.93%)  SecurityExceptions  were  from  apps  attempting  to  use  the  read 
PHONE  STATE  permission,  which  is  used  to  obtain  the  phone  number.  In  the  event  of  a 
READ  PHONE  STATE  denial,  we  designed  our  implementation  to  not  supply  the  app  with 
any  phone  number  data.  We  had  considered  supplying  a  randomly-generated  phone 
number,  but  decided  against  it  due  to  potential  risks,  if  the  generated  number  were  a 
valid  phone  number  belonging  to  someoneelse. 

For  other  denials,  we  opted  to  supply  apps  with  generated  data  to  ensure  their  continued 
operation,  without  actually  exposing  private  user  data.  During  the  study  period,  the 
classifier  denied  10.34%  of  all  permission  requests;  more  than  2,000  denials  per 
participant  per  day.  Our  implementation,  however,  only  recorded  an  average  of  3 
SecurityExceptions  per  day  per  participant.  This  indicates  that  passing  synthetic  but 
well-formed  data  to  apps  in  lieu  of  actual  private  user  data  does  satisfy  app  functionality 
expectations  to  a  great  extent. 

Our  results  are  a  positive  sign  for  future  permission  systems  more  restrictive  than  the 
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current  AOFU  model:  permissions  can  be  more  restrictive  without  forcing  the  user  to 
trade  off  usability  for  improved  privacy  protection,  as  we  will  show  in  the  next  section. 
If  apps  gracefully  handle  resource  denials,  then  users  are  free  to  specify  their  privacy 
preferences  without  risking  functionality  issues. 

4.4  User  Reactions  to  Prompts 

The  use  of  runtime  prompts  was  initially  proposed  as  a  mechanism  to  obtain  better- 
informed  consent  from  users.  At  the  end  of  the  study  period,  we  conducted  exit 
interviews  with  each  participant  in  order  to  determine  the  extent  to  which  these 
assumptions  were  met. 

We  measured  how  much  participants  were  surprised  to  see  the  prompts  during  the 
course  of  the  study  period  (on  a  scale  of  l=“not  surprised”  to  5=“very  surprised”). 
Participants  expressed  an  average  rating  of  2.7.  Almost  half  (44%)  of  the  participants 
indicated  that  the  prompts  surprised  them,  and  among  them,  70%  were  surprised  at  the 
frequency  with  which  the  prompts  appeared  (up  to  4  times  per  day),  though  few 
participants  expressed  annoyance  by  that  frequency  (8.33%). 

We  asked  participants  to  rate  how  much  they  felt  that  they  were  in  control  of  resource 

usage  (on  a  scale  of  l=“nothing  changed  compared  to  default  Android”  to  5=“very  much 

in  control”).  On  average,  our  participants  rated  their  experience  as 

3.44.  Almost  half  (44%)  of  participants  felt  that  they  were  in  control  of  the  system  as  a 

result  of  the  prompts.  A  small  number  (14%)  still  felt  helpless,  regardless  of  their 

responses  to  the  prompts.  They  felt  resigned  that  applications  would  always  obtain  their 

data. 

Finally,  we  asked  participants  how  they  felt  about  the  transparency  provided  by  the 
new  system  compared  to  their  previous  Android  experiences  (on  a  scale  of  l=“nothing 
changed”  to  5=“improved  system  transparency”).  On  average,  participants  rated  system 
transparency  in  the  middle  (3).  Almost  half  (47%)  of  them  felt  that  the  new  system  was 
more  transparent.  A  minority  (14%)  mentioned  wanting  to  know  why  apps  were 
requesting  particular  sensitive  data  types. 

From  these  observations,  we  believe  that  the  new  contextual  permission  system  is  a 
positive  step  toward  improving  user  awareness.  We  believe  this  enables  users  to  make 
better  privacy  decisions  for  themselves.  Although  additional  work  is  needed  to  address 
some  negative  sentiments  about  the  current  implementation,  this  system  has  shown  to  be 
in  the  right  direction  overall. 

4.5  User  Reactions  to  Controls 

Whenever  an  automated  system  makes  decisions  on  a  user’s  behalf,  there  is  the 
inevitable  risk  that  the  system  will  make  an  incorrect  decision.  In  our  case  this  can 
cause  apps  to  be  over-privileged  and  risk  privacy  violations,  or  be  under-privileged  and 
risk  app  failure  or  reduced  functionality.  It  is  important  to  empower  users  so  they  can 
easily  audit  the  decisions  that  were  made  on  their  behalf  and  to  amend  those  decisions 
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that  are  not  aligned  with  their  preferences. 

In  our  implementation,  we  built  a  user  interface  based  on  our  prior  validated  prototypes 
[17].  This  system  allowed  our  participants  to  view  automated  permissions  decisions 
made  by  the  classifier,  as  well  as  set  privacy  preferences  with  respect  to  context  (i.e., 
the  visibility  of  the  requesting  app).  We  included  this  user  interface  as  part  of  the 
operating  system,  as  a  panel  within  the  system  settings  app. 

When  we  on-boarded  our  participants,  we  mentioned  to  them  that  there  was  a  new 
“permission  manager”  available,  but  to  avoid  priming  them,  we  made  sure  not  to 
emphasize  it  in  any  particular  way.  Our  instrumented  platform  logged  every  time 
participants  interacted  with  our  permission  manager  to  understand  how  they  used  it. 

Fifteen  of  the  37  participants  (40.5%)  opened  the  permission  manager  during  the  study 
period.  Our  implementation  logged  a  total  of  169  preference  changes  across  these 
participants.  Only  6  out  of  37  participants  (16.2%)  changed  the  settings  to  be  more 
restrictive.  Of  the  adjustments  made  towards  more  restrictiveness,  the  majority  were 
for  the  GET  ACCOUNTS  permission,  which  prevents  apps  from  reading  the  user’s  stored 
credential  data  (e.g.,  usernames  linked  to  accounts  on  the  device,  such  as  for  Google, 
Twitter,  etc.).  In  contrast,  the  most-common  permission  that  participants  adjusted  to  be 
more  permissive  was  READ  CONTACTS.  When  asked  for  their  motives  behind  these 
changes,  the  majority  of  participants  said  that  functionality  was  their  main  reason  for 
granting  more  access,  and  the  sensitivity  of  data  for  restricting  access. 

We  also  asked  participants  to  demonstrate  how  they  would  change  the  settings  of  a 
familiar  app  to  only  be  able  to  access  their  location  when  they  are  using  that  app.  We 
based  this  task  off  of  one  of  the  evaluation  tasks  we  performed  in  the  dashboard 
evaluation  experiment  [17],  where  we  performed  an  online  study  to  evaluate  a  low- 
fidelity  prototype  of  the  design  on  which  we  based  our  user  interface.  All  but  two  of 
our  participants  were  able  to  correctly  complete  this  task  using  the  user  interface. 
Participants  rated  the  average  ease  of  the  task  as  1.15  (on  a  scale  from  l=“very  easy”  to 
5=“very  hard”).  We  conclude  that  participants  are  able  to  understand  the  permission 
interface  after  having  used  it  for  a  week,  and  without  special  instructions. 

The  permission  manager  also  enables  users  to  diagnose  application  crashes  that  result 
from  a  resource  denial  (a  feature  not  present  in  the  original  design  on  which  we  based 
it).  In  exit  interviews,  we  examined  how  participants  responded  to  app  crashes  in  their 
experiences  with  the  device.  The  majority  of  participants  reported  that  their  first  step 
was  to  restart  the  app  that  had  crashed.  If  that  was  unsuccessful,  they  would  then  restart 
their  phone.  This  informs  the  design  of  a  future  system:  if  an  app  crashes  as  a  result  of 
a  resource  denial,  the  platform  should  clearly  communicate  this  to  users  or  otherwise 
automatically  adjust  the  permissions  on  their  behalf.  This  could  be  communicated 
through  a  dialog  or  in  the  notification  bar. 
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4.6  Discussion 

The  core  objective  of  our  37-person  field  study  was  to  analyze  how  a  contextually- 
aware,  more-restrictive  permission  model  performs  in  the  wild,  thereby  validating  the 
work  we  had  performed  in  this  project.  We  examined  how  participants  balanced  their 
privacy  preferences  with  app  functionality.  This  measures  the  real-world  applicability 
of  predicting  user  privacy  decisions  with  the  help  of  contextual  cues  surrounding  each 
permission  request. 

4.6.1  Consequential  Denial 

Overall,  participants  denied  24%  of  all  prompted  permission  requests.  This  is  a  60% 
reduction  in  denials  compared  to  the  results  we  attained  at  the  onset  of  the  project, 
when  evaluating  an  offline  classifier  [19],  which  did  not  enforce  the  user’s  decision  to 
deny  a  permission  and  prompted  the  user  using  only  hypothetical  language:  “given  the 
choice,  would  you  have  denied...?”  The  decreased  denial  rate  we  observed  is  therefore 
unsurprising  given  that  participants  were  now  actually  making  a  tradeoff  between 
functionality  and  privacy,  instead  of  expressing  the  degree  to  which  privacy  is  important 
to  them.  Our  results  show  that  even  in  the  presence  of  consequential  resource  denial, 
contextual  cues  helped  to  predict  users’  privacy  decisions  and  better  aligned  permission 
settings  with  their  expectations,  as  compared  to  the  status  quo. 

4.6.2  Ask  on  First  Use 

Our  results  corroborate  our  initial  work  [18,  19]  in  showing  that  AOFU’s  inability  to 
capture  the  context  surrounding  users’  decisions  is  a  cause  of  AOFU’s  significant  error 
rate,  and  based  on  our  qualitative  interviews,  provides  serious  concerns  when  used  in 
high-risk  BYOD  environments.  We  also  found  that  a  significant  portion  of  participants 
do  not  have  an  adequate  understanding  of  how  AOFU  works,  which  further  limits 
AOFU’s  utility:  11  participants  did  not  realize  that  their  prompt  responses  for  AOFU 
are  taken  as  permanent  decisions;  and  4  participants  interpreted  the  prompts  as  yet 
another  mechanism  for  collecting  user  data  instead  of  as  a  privacy-protection 
mechanism.  While  the  actual  impact  of  these  inaccurate  beliefs  is  yet  to  be  explored, 
we  believe  that  these  issues  need  to  be  fixed  in  the  future,  in  order  to  increase  Android’s 
ability  to  predict  and  protect  user  data  effectively. 

4.6.3  Implementation  Limitations 

While  our  new  permission  model  reduces  the  number  of  mis -predictions  compared  to 
AOFU  by  50%,  our  offline  analysis  shows  that  it  has  the  potential  to  reduce  mis¬ 
predictions  by  75%.  A  further  examination  revealed  that  the  performance  difference  is 
due  to  the  bootstrapping  of  the  training  dataset  in  the  implementation.  We  note  that 
difference  is  not  inherent  to  running  a  classifier  in  Android,  and  so  simply  modifying 
our  implementation  to  use  these  improvements  will  allow  it  to  achieve  the  same 
performance. 
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4.6.4  Purpose 

While  our  new  permission  model  outperforms  AOFU,  it  still  does  not  explain  to  the 
user  why  an  app  needs  to  use  a  permission.  In  our  exit  interviews,  we  observed  that  14% 
of  participants  expressed  the  desire  to  know  why  apps  made  a  request  in  the  first  place. 
Previous  work  has  shown  that  app  functionality  is  a  key  factor  in  permission  decisions 
[3].  If  users  were  properly  informed  of  the  functionality  requirement  behind  a 
permission  request,  then  they  might  be  better  positioned  to  make  decisions  that  meet 
their  privacy  and  functionality  expectations. 

We  believe  that  there  are  ways  to  extend  contextual  permission  systems  by 
incorporating  the  actual  purpose  of  the  request.  For  example,  after  introducing  AOFU 
permissions.  Android  started  encouraging  app  developers  to  provide  the  reason  behind 
their  permission  requests  so  that  the  user  can  include  that  in  the  decision  making 
process  [5].  Tan  et  al.  [16]  showed  that  similar  prompts  on  iOS  actually  resulted  in 
users  being  more  permissive  about  granting  permissions  to  apps.  Similarly,  prior  work 
has  attempted  to  use  static  analysis  to  automatically  incorporate  inferred  purpose  [13, 
12]. 

4.6.5  Resource  Denial 

When  deploying  more-restrictive  permission  systems,  it  is  important  that  apps  continue 
to  run  without  entering  into  an  error  state  that  results  from  a  resource  denial.  Users 
should  be  able  to  select  their  privacy  preferences  with  minimal  disruption  to  their 
experience;  apps  must  not  be  able  to  force  an  ultimatum  by  simply  not  functioning  if  a 
permission  is  denied.  Indeed,  some  participants  simply  allow  most  permission  requests 
because  that  ensures  their  apps  run  properly. 

The  platform,  therefore,  is  responsible  to  ensure  that  apps  handle  resource  denials 
gracefully.  To  their  credit,  when  Android  introduced  AOFU,  it  implemented  some 
permission  denials  to  appear  like  a  lack  of  available  data  or  the  non-existence  of 
hardware,  instead  of  throwing  a  SecurityException.  In  our  implementation,  we  take 
the  extra  step  of  supplying  apps  with  generic  but  well-formed  data  in  the  event  of  a 
denial.  We  observed  that  our  participants  tended  to  deny  more  permissions  as  they 
progressed  through  the  study  period  (on  average  20%  denial  in  the  learning  phase 
versus  a  26%  denial  rate  during  the  validation  phase).  Those  participants  also 
experienced  a  low  rate  of  app  failures  due  to  resource  denials.  In  the  future,  platforms 
should  implement  measures  to  reduce  functionality  losses  stemming  from  having 
stricter  privacy  preferences.  Failing  to  do  so  might  otherwise  compel  users  to 
compromise  on  their  privacy  preferences  for  the  sake  of  functionality. 

4.6.6  Remedying  Unexpected  Behavior 

Regardless  of  any  mitigations  to  avoid  app  crashes,  it  is  practical  to  assume  that  apps 
will  crash  when  they  fail  to  receive  expected  data  under  certain  circumstances.  One 
way  to  remedy  this  is  to  give  users  tools  to  adjust  the  behavior  of  the  permission  system. 
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such  as  being  able  to  be  more  permissive  to  certain  applications  in  certain  contexts. 
This  approach,  however,  assumes  that  (i)  users  accurately  attribute  a  crash  event  to  a 
resource  denial,  which  may  not  always  be  the  case,  and  (ii)  users  are  sufficiently 
technical  to  identify  which  resource  denial  caused  the  crash.  In  our  implementation  of  a 
new  permission  manager,  we  address  the  latter  assumption  by  providing  users  a 
timeline  of  recent  decisions  made  by  the  new  permission  system,  which  can  be  used  to 
deduce  the  cause  of  a  crash. 

Our  exit  interviews  showed  that  few  participants  would  think  to  check  the  permission 
manager  following  an  application  crash,  so  clearly  more  work  is  needed  here.  With 
proposals  for  more-accurate  and  more -restrictive  permission  models,  it  is  necessary  to 
have  usable  mechanisms  to  deal  with  inevitable  crashes  due  to  resource  denials.  The 
platform  should  provide  mechanisms  either  to  help  the  user  diagnose  and  resolve  such 
crashes,  or  to  automatically  fix  permissions  on  a  temporary  basis  and  give  the  user  an 
option  to  make  the  fix  permanent. 

5  CONCLUSION 

Our  validation  study  shows  how  applications  and  users  respond  to  a  real-world 
deployment  of  a  novel  contextually-aware  permission  model,  which  we  developed 
based  on:  i)  soliciting  feedback  from  “extreme  users”  in  BYOD  environment,  ii) 
iterative  prototyping  on  the  user  interface  dashboard,  and  iii)  the  development  of  a 
privacy  preferences  classifier  using  offline  training.  The  new  permission  system  based 
on  these  components  significantly  reduced  the  error  rate  from  that  of  the  prevailing 
“ask-on-first-use”  model  first  deployed  in  Android  6.0.  While  prior  work  already 
demonstrated  ways  to  increase  the  protection  provided  by  new  permission  models,  we 
believe  our  study  provides  opportunities  to  further  improve  performance  and  address 
practical  limitations  in  actual  implementations. 
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Abstract 

We  instrumented  the  Android  platform  to  collect  data  re¬ 
garding  how  often  and  under  what  circumstances  smart¬ 
phone  applications  access  protected  resources  regulated 
by  permissions.  We  performed  a  36-person  field  study  to 
explore  the  notion  of  “contextual  integrity,”  i.e.,  how 
often  applications  access  protected  resources  whenusers 
are  not  expecting  it.  Based  on  our  collection  of  27M  data 
points  and  exit  interviews  with  participants,  we  exam¬ 
ine  the  situations  in  which  users  would  like  the  ability  to 
deny  applications  access  to  protected  resources.  At  least 
80%  of  our  participants  would  have  preferred  to  prevent 
at  least  one  permission  request,  and  overall,  they  stated  a 
desire  to  block  over  a  third  of  all  requests.  Our  findings 
pave  the  way  for  future  systems  to  automatically  deter¬ 
mine  the  situations  in  which  users  would  want  to  be  con¬ 
fronted  with  security  decisions. 

1  Introduction 

Mobile  platform  permission  models  regulate  how  appli¬ 
cations  access  certain  resources,  such  as  users’  personal 
information  or  sensor  data  (e.g.,  camera,  GPS,  etc.).  For 
instance,  previous  versions  of  Android  prompt  the  user 
during  application  installation  with  a  list  of  all  the  per¬ 
missions  that  the  application  may  use  in  the  future;  if  the 
user  is  uncomfortable  granting  any  of  these  requests,  her 
only  option  is  to  discontinue  installation  [3].  OniOS  and 
Android  M,  the  user  is  prompted  at  runtime  the  first  time 
an  application  requests  any  of  a  handful  of  data  types, 
such  as  location,  address  book  contacts,  or  photos  [34]. 

Research  has  shown  that  few  people  read  the  Android 
install-time  permission  requests  and  even  fewercompre- 
hend  them  [16].  Another  problem  is  habituation;  on  av¬ 
erage,  Android  applications  present  the  user  with  four 
permission  requests  during  the  installation  process  [13]. 
While  iOS  users  are  likely  to  see  fewer  permission  re¬ 
quests  than  Android  users,  because  there  are  fewer  pos¬ 
sible  permissions  and  they  are  only  displayed  the  first 


time  the  data  is  actually  requested,  it  is  not  clear  whether 
or  not  users  are  being  prompted  about  access  to  data  that 
they  actually  find  concerning,  or  whether  they  would  ap¬ 
prove  of  subsequent  requests  [15]. 

Nissenbaum  posited  that  the  reason  why  most  privacy 
models  fail  to  predict  violations  is  that  they  fail  to  con¬ 
sider  contextual  integrity  [32].  That  is,  privacy  violations 
occur  when  personal  information  is  used  in  ways  that 
defy  users’  expectations.  We  believe  that  this  notion  of 
“privacy  as  contextual  integrity”  can  be  applied  to  smart¬ 
phone  permission  systems  to  yield  more  effective  per¬ 
missions  by  only  prompting  users  when  an  application’s 
access  to  sensitive  data  is  likely  to  defy  expectations.  As 
a  first  step  down  this  path,  we  examined  how  applica¬ 
tions  are  currently  accessing  this  data  and  then  examined 
whether  or  not  it  complied  with  users’  expectations. 

We  modified  Android  to  log  whenever  an  application 
accessed  a  permission-protected  resource  and  then  gave 
these  modified  smartphones  to  36  participants  who  used 
them  as  their  primary  phones  for  one  week.  The  pur¬ 
pose  of  this  was  to  perform  dynamic  analysis  to  deter¬ 
mine  how  often  various  applications  are  actually  access¬ 
ing  protected  resources  under  realistic  circumstances. 
Afterwards,  subjects  returned  the  phones  to  our  labora¬ 
tory  and  completed  exit  surveys.  We  showed  them  vari¬ 
ous  instances  over  the  past  week  where  applications  had 
accessed  certain  types  of  data  and  asked  whether  those 
instances  were  expected,  and  whether  they  would  have 
wanted  to  deny  access.  Participants  wanted  to  block  a 
third  of  the  requests.  Their  decisions  were  governed  pri¬ 
marily  by  two  factors:  whether  they  had  privacy  concerns 
surrounding  the  specific  data  type  and  whether  they  un¬ 
derstood  why  the  application  needed  it. 

We  contribute  the  following; 

•  To  our  knowledge,  we  performed  the  first  field  study 
to  quantify  the  permission  usage  by  third-party  ap¬ 
plications  under  realistic  circumstances. 
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•  We  show  that  our  participants  wanted  to  block  ac¬ 
cess  to  protected  resources  a  third  of  the  time.  This 
suggests  that  some  requests  should  be  granted  by 
runtime  consent  dialogs,  rather  than  Android’s  pre¬ 
vious  all-or-nothing  install-time  approval  approach. 

•  We  show  that  the  visibility  of  the  requesting  appli¬ 
cation  and  the  frequency  at  which  requests  occur  are 
two  important  factors  which  need  to  be  taken  into 
account  in  designing  a  runtime  consent  platform. 

2  Related  Work 

While  users  are  required  to  approve  Android  application 
permission  requests  during  installation,  most  do  not  pay 
attention  and  fewer  comprehend  these  requests  [16,  26]. 
In  fact,  even  developers  are  not  fully  knowledgeable 
about  permissions  [40],  and  are  given  a  lot  of  freedom 
when  posting  an  application  to  the  Google  Play  Store  [7]. 
Applications  often  do  not  follow  the  principle  of  least 
privilege,  intentionally  or  unintentionally  [44].  Other 
work  has  suggested  improving  the  Android  per-  mission 
model  with  better  definitions  and  hierarchical 
breakdowns  [8].  Some  researchers  have  experimented 
with  adding  fine-grained  access  control  to  the  Android 
model  [11].  Providing  users  with  more  privacy  informa¬ 
tion  and  personal  examples  has  been  shown  to  help  users 
in  choosing  applications  with  fewer  permissions  [21,27]. 

Previous  work  has  examined  the  overuse  of  permissions 
by  applications  [13,  20],  and  attempted  to  identify  mali¬ 
cious  applications  through  their  permission  requests  [36] 
or  through  natural  language  processing  of  application  de¬ 
scriptions  [35].  Researchers  have  also  developed  static 
analysis  tools  to  analyze  Android  permission  specifica¬ 
tions  [6,  9,  13].  Our  work  complements  this  static  anal¬ 
ysis  by  applying  dynamic  analysis  to  permission  us-  age. 
Other  researchers  have  applied  dynamic  analysis  to 
native  (non- Java)  APIs  among  third-party  mobile  mar¬ 
kets  [39];  we  apply  it  to  the  Java  APIs  available  to  devel¬ 
opers  in  the  Google  Play  Store. 

Researchers  examined  user  privacy  expectations  sur¬ 
rounding  application  permissions,  and  found  that  users 
were  often  surprised  by  the  abilities  of  background  ap¬ 
plications  to  collect  data  [25, 42].  Their  level  of  con-  cern 
varied  from  annoyance  to  seeking  retribution  when 
presented  with  possible  risks  associated  with  permis¬ 
sions  [15].  Some  studies  employed  crowdsourcing  to 
create  a  privacy  model  based  on  user  expectations  [30]. 

Researchers  have  designed  systems  to  track  or  reduce 
privacy  violations  by  recommending  applications  based 
on  users’  security  concerns  [2,  12,  19,  24,  28,  46^8]. 
Other  tools  dynamically  block  runtime  permission  re¬ 
quests  [37].  Enck  et  al.  found  that  a  considerable  number 
of  applications  transmitted  location  or  other  user  data  to 


third  parties  without  requiring  user  consent  [12].  Horny- 
ack  et  al.’s  AppFence  system  gave  users  the  ability  to 
deny  data  to  applications  or  substitute  fake  data  [24]. 
However,  this  broke  application  functionality  for  one- 
third  of  the  applications  tested. 

Reducing  the  number  of  security  decisions  a  user  must 
make  is  likely  to  decrease  habituation,  and  therefore,  it  is 
critical  to  identify  which  security  decisions  users  should 
be  asked  to  make.  Based  on  this  theory.  Felt  et  al.  created 
a  decision  tree  to  aid  platform  designers  in  determining 
the  most  appropriate  permission-granting  mechanism  for 
a  given  resource  (e.g.,  access  to  benign  resources  should 
be  granted  automatically,  whereas  access  to  dangerous 
resources  should  require  approval)  [14].  They  concluded 
that  the  majority  of  Android  permissions  can  be  automat¬ 
ically  granted,  but  16%  (corresponding  to  the  12  permis¬ 
sions  in  Table  1)  should  be  granted  via  runtime  dialogs. 

Nissenbaum’s  theory  of  contextual  integrity  can  help  us 
to  analyze  “the  appropriateness  of  a  flow”  in  the  context 
of  permissions  granted  to  Android  applications  [32]. 
There  is  ambiguity  in  defining  when  an  application  actu¬ 
ally  needs  access  to  user  data  to  run  properly.  It  is  quite 
easy  to  see  why  a  location-sharing  application  would 
need  access  to  GPS  data,  whereas  that  same  request  com¬ 
ing  from  a  game  like  Angry  Birds  is  less  obvious.  “Con¬ 
textual  integrity  is  preserved  if  information  flows  accord¬ 
ing  to  contextual  norms”  [32],  however,  the  lack  of  thor¬ 
ough  documentation  on  the  Android  permission  model 
makes  it  easier  for  programmers  to  neglect  these  norms, 
whether  intentionally  or  accidentally  [38].  Deciding  on 
whether  an  application  is  violating  users’  privacy  can  be 
quite  complicated  since  “the  scope  of  privacy  is  wide- 
ranging”  [32].  To  that  end,  we  performed  dynamic  analy¬ 
sis  to  measure  how  often  (and  under  what  circumstances) 
applications  were  accessing  protected  resources,  whether 
this  complied  with  users’  expectations,  as  well  as  how 
often  they  might  be  prompted  if  we  adopt  Felt  et  al.’s 
proposal  to  require  runtime  user  confirmation  before  ac¬ 
cessing  a  subset  of  these  resources  [14].  Finally,  we  show 
how  it  is  possible  to  develop  a  classifier  to  automatically 
determine  whether  or  not  to  prompt  the  user  based  on 
varying  contextual  factors. 

3  Methodology 

Our  long-term  research  goal  is  to  minimize  habituation 
by  only  confronting  users  with  necessary  security  de¬ 
cisions  and  avoiding  showing  them  permission  requests 
that  are  either  expected,  reversible,  or  unconcerning.  Se¬ 
lecting  which  permissions  to  ask  about  requires  under¬ 
standing  how  often  users  would  be  confronted  with  each 
type  of  request  (to  assess  the  risk  of  habituation)  and  user 
reactions  to  these  requests  (to  assess  the  benefit  to  users). 
In  this  study,  we  explored  the  problem  space  in  two  parts: 
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we  instrumented  Android  so  that  we  could  collect  actual 
usage  data  to  understand  how  often  access  to  various  pro¬ 
tected  resources  is  requested  by  applications  in  practice, 
and  then  we  surveyed  our  participants  to  understand  the 
requests  that  they  would  not  have  granted,  if  given  the 
option.  This  field  study  involved  36  participants  over  the 
course  of  one  week  of  normal  smartphone  usage.  In  this 
section,  we  describe  the  log  data  that  we  collected,  our 
recruitment  procedure,  and  then  our  exit  survey. 

3.1  Tracking  Access  to  Sensitive  Data 

In  Android,  when  applications  attempt  to  access  pro¬ 
tected  resources  (e.g.,  personal  information,  sensor  data, 
etc.)  at  runtime,  the  operating  system  checks  to  see 
whether  or  not  the  requesting  application  was  previously 
granted  access  during  installation.  We  modified  the  An¬ 
droid  platform  to  add  a  logging  framework  so  that  we 
could  determine  every  time  one  of  these  resources  was 
accessed  by  an  application  at  runtime.  Because  our  target 
device  was  a  Samsung  Nexus  S  smartphone,  we  modified 
Android  4.1.1  (Jellybean),  which  was  the  newest  version 
of  Android  supported  by  our  hardware. 

3.1.1  Data  Collection  Architecture 

Our  goal  was  to  collect  as  much  data  as  possible  about 
each  applications’  access  to  protected  resources,  while 
minimizing  our  impact  on  system  performance.  Our  data 
collection  framework  consisted  of  two  main  com¬ 
ponents:  a  series  of  “producers”  that  hooked  various  An¬ 
droid  API  calls  and  a  “consumer”  embedded  in  the  main 
Android  framework  service  that  wrote  the  data  to  a  log 
file  and  uploaded  it  to  our  collection  server. 

We  logged  three  kinds  of  permission  requests.  First,  we 
logged  function  calls  checked  by  checkPermission  ( ) 
in  the  Android  Context  implementation.  Instru¬ 
menting  the  Context  implementation,  instead  of  the 
ActivityManagerService  or  PackageManager,  al¬ 
lowed  us  to  also  log  the  function  name  invoked  by  the 
user-space  application.  Next,  we  logged  access  to  the 
ContentProvider  class,  which  verifies  the  read  and 
write  permissions  of  an  application  prior  to  it  accessing 
structured  data  (e.g.,  contacts  or  calendars)  [5].  Finally, 
we  tracked  permission  checks  during  Intent  transmis¬ 
sion  by  instrumenting  the  ActivityManagerService 
and  BroadcastQueue.  Intents  allow  an  application  to 
pass  messages  to  another  application  when  an  activity  is 
to  be  performed  in  that  other  application  (e.g.,  opening  a 
URL  in  the  web  browser)  [4]. 

We  created  a  component  called  Producer  thatfetches 
the  data  from  the  above  instrumented  points  and  sends  it 
back  to  the  Consumer,  which  is  responsible  for  logging 
everything  reported.  Producers  are  scattered  across 
the  Android  Platform,  since  permission  checks  occur  in 


multiple  places.  The  Producer  that  logged  the  most 
data  was  in  system“server  and  recorded  direct  func¬ 
tion  calls  to  Android’s  Java  API.  For  a  majority  of  priv¬ 
ileged  function  calls,  when  a  user  application  invokes  the 
function,  it  sends  the  request  to  system  server  via 
Binder.  Binder  is  the  most  prominent  IPC  mech¬ 
anism  implemented  to  communicate  with  the  Android 
Platform  (whereas  Intents  communicate  between  ap¬ 
plications).  For  requests  that  do  not  make  IPC  calls  to  the 
system  server, a  Producer  is  placed  in  the  user  appli¬ 
cation  context  (e.g.,  in  the  case  of  ContentProviders). 

The  Consumer  class  is  responsible  for  logging  data  pro¬ 
duced  by  each  Producer.  Additionally,  the  Consumer 
also  stores  contextual  information,  which  we  describe  in 
Section  3.1.2.  The  Cons  ume  r  syncs  data  with  the  filesys¬ 
tem  periodically  to  minimize  impact  on  system  perfor¬ 
mance.  All  log  data  is  written  to  the  internal  storage  of 
the  device  because  the  Android  kernel  is  not  allowed  to 
write  to  external  storage  for  security  reasons.  Although 
this  protects  our  data  from  curious  or  careless  users,  it 
also  limits  our  storage  capacity.  Thus,  we  compressed 
the  log  files  once  every  two  hours  and  upload  them  to  our 
collection  servers  whenever  the  phone  had  an  active 
Internet  connection  (the  average  uploaded  and  zipped  log 
file  was  around  108KB  and  contained  9,000  events). 

Due  to  the  high  volume  of  permission  checks  we  en¬ 
countered  and  our  goal  of  keeping  system  performance 
at  acceptable  levels,  we  added  rate-limiting  logic  to  the 
Consumer.  Specifically,  if  it  has  logged  permission 
checks  for  a  particular  application/permission  combina¬ 
tion  more  than  10,000  times,  it  examines  whether  it  did 
so  while  exceeding  an  average  rate  of  1  permission  check 
every  2  seconds.  If  so,  the  Consumer  will  only  record 
10%  of  all  future  requests  for  this  application/permission 
combination.  When  this  rate-limiting  is  enabled,  the 
Cons  ume  r  tracks  these  application/permission  combina¬ 
tions  and  updates  all  the  Producers  so  that  they  start 
dropping  these  log  entries.  Finally,  the  Cons  ume  r  makes 
a  note  of  whenever  this  occurs  so  that  we  can  extrapolate 
the  true  number  of  permission  checks  that  occurred. 

3.1.2  Data  Collection 

We  hooked  the  permission-checking  APIs  so  that  every 
time  the  system  checked  whether  an  application  had  been 
granted  a  particular  permission,  we  logged  the  name  of 
the  permission,  the  name  of  the  application,  and  the  API 
method  that  resulted  in  the  check.  In  addition  to  times¬ 
tamps,  we  collected  the  following  contextual  data: 

•  Visibility — We  categorized  whether  the  requesting 
application  was  visible  to  the  user,  using  four  cate¬ 
gories:  running  (a)  as  a  service  with  no  user  inter¬ 
action;  (b)  as  a  service,  but  with  user  interaction  via 
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notifications  or  sounds;  (c)  as  a  foreground  process, 
but  in  the  background  due  to  multitasking;  or  (d)  as 
a  foreground  process  with  direct  user  interaction. 

•  Screen  Status — Whether  the  screen  wason/off. 

•  Connectivity — The  phone’s  WiFi  connection  state. 

•  Location — The  user’s  last  known  coordinates.  In 
order  to  preserve  battery  life,  we  collected  cached 
location  data,  rather  than  directly  querying  the  GPS. 

.  View — The  UI  elements  in  the  requesting  applica¬ 
tion  that  were  exposed  to  the  user  at  the  time  that  a 
protected  resource  was  accessed.  Specifically,  since 
the  UI  is  built  from  an  XML  file,  we  recorded  the 
name  of  the  screen  as  defined  in  the  DOM. 

«  History — A  list  of  applications  with  which  the  user 
interacted  prior  to  the  requesting  application. 

.  Path — When  access  to  a  ContentProvider  object 
was  requested,  the  path  to  the  specific  content. 


Felt  et  al.  proposed  granting  most  Android  permissions 
without  a  priori  user  approval  and  granting  12  permis¬ 
sions  (Table  1)  at  runtime  so  that  users  have  contextual 
information  to  infer  why  the  data  might  be  needed  [14]. 
The  idea  is  that,  if  the  user  is  asked  to  grant  a  permission 
while  using  an  application,  she  may  have  some  under¬ 
standing  of  why  the  application  needs  that  permission 
based  on  what  she  was  doing.  We  initially  wanted  to 
perform  experience  sampling  by  probabilistically  ques¬ 
tioning  participants  whenever  any  of  these  12  permis¬ 
sions  were  checked  [29].  Our  goal  was  to  survey  par¬ 
ticipants  about  whether  access  to  these  resources  was  ex¬ 
pected  and  whether  it  should  proceed,  but  we  were  con¬ 
cerned  that  this  would  prime  them  to  the  security  focus 
of  our  experiment,  biasing  their  subsequent  behaviors. 
Instead,  we  instrumented  the  phones  to  probabilistically 
take  screenshots  of  what  participants  were  doing  when 
these  12  permissions  were  checked  so  that  we  could  ask 
them  about  it  during  the  exit  survey.  We  used  reservoir 
sampling  to  minimize  storage  and  performance  impacts, 
while  also  ensuring  that  the  screenshots  covered  a  broad 
set  of  applications  and  permissions  [43]. 


Figure  1  shows  a  screenshot  captured  during  the  study 
along  with  its  corresponding  log  entry.  The  user  was 
playing  the  Solitaire  game  while  Spotify  requested  a 
WiFi  scan.  Since  this  permission  was  of  interest  (Table 
1),  our  instrumentation  took  a  screenshot.  Since  Spotify 
was  not  the  application  the  participant  was  interacting 
with,  its  visibility  was  set  Xo  false.  The  history  shows  that 
prior  to  Spotify  calling  getScanResults  { ) ,  the  user 
had  viewed  Solitaire,  the  call  screen,  the  launcher,  and 
the  list  of  MMS  conversations. 


Permission  Type 

Activity 

WRITE  SYNC  ^ 
SETTINGS 

Change  application  sync  settings 
when  the  user  is  roaming 

ACCESS  WIFI" 
STATE 

View  nearhy  SSIDs 

INTERNET 

Access  Internet  when  roaming 

NFC 

Communicate  via  NFC 

READTIISTORY- 

BOOKMARKS 

Read  users’  browser  history 

ACCESS-FINE- 

LOCATION 

Read  GPS  location 

ACCESS-COARSE" 

LOCATION 

Read  network-inferred  location 
(i.e.,  cell  tower  and/or  WiFi) 

LOCATION- 

HARDWARE 

Directly  access  GPS  data 

READ- CALL-LOG 

Read  call  history 

ADD-VOICEMAIL 

Read  call  history 

READ  ISMS 

Read  sent/received/draft  SMS 

SEND^MS 

Send  SMS 

Table  1;  The  12  permissions  that  Felt  et  al.  recommend 
be  granted  via  runtime  dialogs  [14].  We  randomly  took 
screenshots  when  these  permissions  were  requested  by 
applications,  and  we  asked  about  them  in  our  exit  survey. 


3.2  Recruitment 

We  placed  an  online  recruitment  advertisement  on 
Craigslist  in  October  of  2014,  under  the  “et  cetera  jobs” 
section.^  The  title  of  the  advertisement  was  “Research 
Study  on  Android  Smartphones,”  and  it  stated  that  the 
study  was  about  how  people  interact  with  their  smart¬ 
phones.  We  made  no  mention  of  security  or  privacy. 
Those  interested  in  participating  were  directed  to  an  on¬ 
line  consent  form.  Upon  agreeing  to  the  consent  form, 
potential  participants  were  directed  to  a  screening  appli¬ 
cation  in  the  Google  Play  store.  The  screening  applica¬ 
tion  asked  for  information  about  each  potential  partici¬ 
pant’s  age,  gender,  smartphone  make  and  model.  It  also 
collected  data  on  their  phones’  internal  memory  size  and 
the  installed  applications.  We  screened  out  applicants 
who  were  under  1 8  years  of  age  or  used  providers  other 
than  T-Mobile,  since  our  experimental  phones  could  not 
attain  3G  speeds  on  other  providers.  We  collected  data  on 
participants’  installed  applications  so  that  we  could  pre¬ 
install  free  applications  prior  to  them  visiting  our  labo¬ 
ratory.  (We  copied  paid  applications  from  their  phones, 
since  we  could  not  download  those  ahead  of  time.) 

We  contacted  participants  who  met  our  screening  re¬ 
quirements  to  schedule  a  time  to  do  the  initial  setup. 
Overall,  48  people  showed  up  to  our  laboratory,  and  of 
those,  40  qualified  (8  were  rejected  because  our  screen¬ 
ing  application  did  not  distinguish  some  Metro  PCS  users 

'Approved  by  the  UC  Berkeley  IRB  under  protocol  #2013-02-4992 
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(a)  Screenshot 


Name 

Log  Data 

Type 

APIFUNC 

Permission 

ACCESS  WIFI  STATE 

App  Name 

com.  spotify. music 

Timestamp 

I4I2888326273 

API  Function 

getScanResultsO 

Visibility 

FALSE 

Screen  Status 

SCREEN  ON 

Connectivity 

NOTGONNECTED 

Location 

Lat  37.XXX  Long  -I22.XXX  - 
I4I253868664I  (Time  it  was  updated) 

View 

com.mobilityware.solitaire/.Solitaire 

History 

com.android.pbone/.InCallScreen 
com.  android  .launcber/com.  android .  - 
launcber2.  Launcher 

com.android.mms/ConversationList 

(b)  Corresponding  log  entry 


Figure  1 :  Screenshot  (a)  and  corresponding  log  entry  (b) 
captured  during  the  experiment. 

from  T-Mobile  users).  In  the  email,  we  noted  that  due  to 
the  space  constraints  of  our  experimental  phones,  we 
might  not  be  able  to  install  all  the  applications  on  their 
existing  phones,  and  therefore  they  needed  to  make  a 
note  of  the  ones  that  they  planned  to  use  that  week.  The 
initial  setup  took  roughly  30  minutes  and  involved  trans¬ 
ferring  their  SIM  cards,  helping  them  set  up  their  Google 
and  other  accounts,  and  making  sure  they  had  all  the  ap¬ 
plications  they  needed.  We  compensated  each  participant 
with  a  $35  gift  card  for  showing  up  at  the  setup  session. 
Out  of  40  people  who  were  given  phones,  2  did  not  re¬ 
turn  them,  and  2  did  not  regularly  use  them  during  the 
study  period.  Of  our  36  remaining  participants  who  used 
the  phones  regularly,  19  were  male  and  17  were  female; 
ages  ranged  from  20  to  63  years  old  (jU  =  32,  S  =  1 1). 

After  the  initial  setup  session,  participants  used  the  ex¬ 
perimental  phones  for  one  week  in  lieu  of  their  normal 
phones.  They  were  allowed  to  install  and  uninstall  appli¬ 


cations,  and  we  instructed  them  to  use  these  phones  as 
they  would  their  normal  phones.  Our  logging  framework 
kept  track  of  every  protected  resource  accessed  by  a  user- 
level  application  along  with  the  previously-mentioned 
contextual  data.  Due  to  storage  constraints  on  the  de¬ 
vices,  our  software  uploaded  log  files  to  our  server  every 
two  hours.  However,  to  preserve  participants’  privacy, 
screenshots  remained  on  the  phones  during  the  course  of 
the  week.  At  the  end  of  the  week,  each  participant 
returned  to  our  laboratory,  completed  an  exit  survey,  re¬ 
turned  the  phone,  and  then  received  an  additional  $100 
gift  card  (i.e.,  slightly  more  than  the  value  of  the  phone). 

3.3  Exit  Survey 

When  participants  returned  to  our  laboratory,  they  com¬ 
pleted  an  exit  survey.  The  exit  survey  software  ran  on  a 
laptop  in  a  private  room  so  that  it  could  ask  questions 
about  what  they  were  doing  on  their  phones  during  the 
course  of  the  week  without  raising  privacy  concerns.  We 
did  not  view  their  screenshots  until  participants  gave  us 
permission.  The  survey  had  three  components: 

•  Screenshots — Our  software  displayed  a  screenshot 
taken  after  one  of  the  12  resources  in  Table  1  was 
accessed.  Next  to  the  screenshot  (Figure  2a),  we 
asked  participants  what  they  were  doing  on  the 
phone  when  the  screenshot  was  taken  (open-ended). 
We  also  asked  them  to  indicate  which  of  severalac- 
tions  they  believed  the  application  was  performing, 
chosen  from  a  multiple-choice  list  of  permissions 
presented  in  plain  language  (e.g.,  “reading  browser 
history,”  “sending  a  SMS,”  etc.).  After  answering 
these  questions,  they  proceeded  to  a  second  page  of 
questions  (Figure  2b).  We  informed  participants  at 
the  top  of  this  page  of  the  resource  that  the  appli¬ 
cation  had  accessed  when  the  screenshot  was  taken, 
and  asked  them  to  indicate  how  much  they  expected 
this  (5-point  Likert  scale).  Next,  we  asked,  “if  you 
were  given  the  choice,  would  you  have  prevented 
the  app  from  accessing  this  data,”  and  to  explain 
why  or  why  not.  Finally,  we  asked  for  permission 
to  view  the  screenshot.  This  phase  of  the  exit  survey 
was  repeated  for  10-15  different  screenshots  per 
participant,  based  on  the  number  of  screenshots 
saved  by  our  reservoir  sampling  algorithm. 

•  Locked  Screens — The  second  part  of  our  survey 
involved  questions  about  the  same  protected  re¬ 
sources,  though  accessed  while  device  screens  were 
off  (i.e.,  participants  were  not  using  their  phones). 
Because  there  were  no  contextual  cues  (i.e.,  screen- 
shots),  we  outright  told  participants  which  appli¬ 
cations  were  accessing  which  resources  and  asked 
them  multiple  choice  questions  about  whether  they 
wanted  to  prevent  this  and  the  degree  to  which  these 
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(a)  On  the  first  screen,  participants  answered  questions  to  estab¬ 
lish  awareness  of  the  permission  request  based  on  the  screenshot. 

o  ft  n  _  r  -  — 

Mm  pM*  m«n. »«  f4ii-iiiw<ir»  wt  ScamMuf  Im  wirr 

>  On  4  iCito  1-t  hoo  nHKhdMvOu*x0Kl  0Ht  Uh» 

I  I  }  4  S  (Mmi  fapMttdi 

4  ■  you  w*r*  yiwn  ih*  chew*.  «MMi4d  yew  Imm  ih*  tnen  MOttiAf  i)ui  cM*’ 

V*»  We 

t  «IWy' 


i  H  e  ekjy  ter  tk*  mterclwn  le  vww  iM  t<re«e>he«> 
ftt  W* 


Three  researchers  independently  coded  423  responses  to 
the  open-ended  question  in  the  screenshot  portion  of  the 
survey.  The  number  of  responses  per  participant  varied, 
as  they  were  randomly  selected  based  on  the  number  of 
screenshots  taken:  participants  who  used  their  phones 
more  heavily  had  more  screenshots,  and  thus  answered 
more  questions.  Prior  to  meeting  to  achieve  consensus, 
the  three  coders  disagreed  on  42  responses,  which  re¬ 
sulted  in  an  inter -rater  agreement  of  90%.  Taking  into 
account  the  9  possible  codings  for  each  response,  Fleiss’ 
kappa  yielded  0.61,  indicating  substantial  agreement. 

4  Application  Behaviors 

Over  the  week-long  period,  we  logged  27M  application 
requests  to  protected  resources  governed  by  Android  per¬ 
missions.  This  translates  to  over  100,000  requests  per 
user/day.  In  this  section,  we  quantify  the  circumstances 
under  which  these  resources  were  accessed.  We  focus  on 
the  rate  at  which  resources  were  accessed  when  partici¬ 
pants  were  not  actively  using  those  applications  (i.e.,  sit¬ 
uations  likely  to  defy  users’  expectations),  access  to  cer¬ 
tain  resources  with  particularly  high  frequency,  and  the 
impact  of  replacing  certain  requests  with  runtime  confir¬ 
mation  dialogs  (as  per  Felt  et  al.’s  suggestion  [14]). 


*  * 


(b)  On  the  second  screen,  they  saw  the  resource  accessed,  stated 
whether  it  was  expected,  and  whether  it  should  have  been  blocked. 

Figure  2:  Exit  Survey  Interface 


behaviors  were  expected.  They  answered  these 
questions  for  up  to  10  requests,  similarly  chosen  by 
our  reservoir  sampling  algorithm  to  yield  a  breadth 
of  application/permission  combinations. 

•  Personal  Privacy  Preferences — ^Finally,  in  order 
to  correlate  survey  responses  with  privacy  prefer¬ 
ences,  participants  completed  two  privacy  scales. 
Because  of  the  numerous  reliability  problems  with 
the  Westin  index  [45],  we  computed  the  average  of 
both  Buchanan  et  al.’s  Privacy  Concerns  Scale 
(PCS)  [10]  and  Malhotra  et  al.’s  Internet  Users’  In¬ 
formation  Privacy  Concerns  (lUIPC)  scale  [31]. 

After  participants  completed  the  exit  survey,  we  re¬ 
entered  the  room,  answered  any  remaining  questions, 
and  then  assisted  them  in  transferring  their  SIM  cards 
back  into  their  personal  phones.  Finally,  we  compen¬ 
sated  each  participant  with  a  $100  gift  card. 


4.1  Invisible  Permission  Requests 

In  many  cases,  it  is  entirely  expected  that  an  application 
might  make  frequent  requests  to  resources  protected  by 
permissions.  For  instance,  the  INTERNET  permission  is 
used  every  time  an  application  needs  to  open  a  socket, 
ACCESS  EINE  UOCATION  is  used  every  time  the 
user’s  location  is  checked  by  a  mapping  application,  and 
so  on.  However,  in  these  cases,  one  expects  users  to  have 
certain  contextual  cues  to  help  them  understand  that  these 
applications  are  running  and  making  these  requests. 
Based  on  our  log  data,  most  requests  occurred  while  par¬ 
ticipants  were  not  actually  interacting  with  those  appli¬ 
cations,  nor  did  they  have  any  cues  to  indicate  that  the 
applications  were  even  running.  When  resources  are  ac¬ 
cessed,  applications  can  be  in  five  different  states,  with 
regard  to  their  visibility  to  users: 

1.  Visible  foreground  application  (12.04%):  the  user 
is  using  the  application  requesting  the  resource. 

2.  Invisible  background  appUcation  (0.70%):  due  to 
multitasking,  the  application  is  in  the  background. 

3.  Visible  background  service  (12.86%):  the  appli¬ 
cation  is  a  background  service,  but  the  user  may  be 
aware  of  its  presence  due  to  other  cues  (e.g.,  it  is 
playing  music  or  is  present  in  the  notificationbar). 

4.  Invisible  background  service  (14.40%):  the  appli¬ 
cation  is  a  background  service  without  visibility. 

5.  Screen  off  (60.00%):  the  application  is  running, 
but  the  phone  screen  is  off  because  it  is  not  inuse. 


APROVED  FOR  PUBLIC  RELEASE;  DISTRIBUTION  UNLIMITED 

35 


Permission 

Requests 

ACCESS  NETWORK _STATE 

31,206 

WAKE_LOCK 

23,816 

ACCESS  FINE  LOCATION 

5,652 

GET  ACCOUNTS 

3,411 

ACCESS  WIFI  STATE 

1,826 

UPDATE"  DEVICE  STATS 

1,426 

ACCESSCOARSETLOCATION 

1,277 

AUTHENTICATE  ACCOUNTS 

644 

READ  SYNC  SETTINGS 

426 

INTERNET  - 

416 

Application 

Requests 

Facebook 

36,346 

Google  Location  Reporting 

31,747 

Facebook  Messenger 

22,008 

Taptu  DJ 

10,662 

Google  Maps 

5,483 

Google  Gapps 

4,472 

Foursquare 

3,527 

Yahoo  Weather 

2,659 

Devexpert  Weather 

2,567 

Tile  Game(Umoni) 

2,239 

Table  2:  The  most  frequently  requested  permissions  by 
applications  with  zero  visibility  to  the  user. 


Combining  the  3. 3M  (12.04%  of  27M)  requests  that  were 
granted  when  the  user  was  actively  using  the  application 
(Category  1)  with  the  3.5M  (12.86%  of  27M)  requests 
that  were  granted  when  the  user  had  other  contextual 
cues  to  indicate  that  the  application  was  running  (Cat¬ 
egory  3),  we  can  see  that  fewer  than  one  quarter  of  all 
permission  requests  (24.90%  of  27M)  occurred  when  the 
user  had  clear  indications  that  those  applications  were 
running.  This  suggests  that  during  the  vast  majority  of 
the  time,  access  to  protected  resources  occurs  opaquely 
to  users.  We  focus  on  these  20. 3M  “invisible”  requests 
(75.10%  of  27M)  in  the  remainder  of  this  subsection. 

Harbach  et  al.  found  that  users’  phone  screens  are  off 
94%  of  the  time  on  average  [22].  We  observed  that  60% 
of  permission  requests  occurred  while  participants’ 
phone  screens  were  off,  which  suggests  that  permission 
requests  occurred  less  frequently  than  when  participants 
were  using  their  phones.  At  the  same  time,  certain  appli¬ 
cations  made  more  requests  when  participants  were  not 
using  their  phones:  “Brave  Frontier  Service,”  “Microsoft 
Sky  Drive,”  and  “Tile  game  by  UMoni.”  Our  study  col¬ 
lected  data  on  over  300  applications,  and  therefore  it  is 
possible  that  with  a  larger  sample  size,  we  would  ob¬ 
serve  other  applications  engaging  in  this  behavior.  All  of 
the  aforementioned  applications  primarily  requested  AC¬ 
CESS  _WIFI  STATE  and  INTERNET.  While  a  definitive 
explanation  for  this  behavior  requires  examining  source 
code  or  the  call  stacks  of  these  applications,  we  hypothe¬ 
size  that  they  were  continuously  updating  local  data  from 
remote  servers.  Eor  instance.  Sky  Drive  may  have  been 
updating  documents,  whereas  the  other  two  applications 
may  have  been  checking  the  status  of  multiplayer  games. 

Table  2  shows  the  most  frequently  requested  permis¬ 
sions  from  applications  running  invisibly  to  the  user  (i.e.. 
Categories  2,  4,  and  5);  Table  3  shows  the  applications 
responsible  for  these  requests  (Appendix  A  lists  the 
permissions  requested  by  these  applications).  We 


Table  3:  The  applications  making  the  most  permission 
requests  while  running  invisibly  to  the  user. 

normalized  the  numbers  to  show  requests  per  user/day. 
ACCESS_NETWORK_STATE  was  most  frequently  re¬ 
quested,  averaging  31,206  times  per  user/day — roughly 
once  every  3  seconds.  This  is  due  to  applications  con¬ 
stantly  checking  for  Internet  connectivity.  However,  the 
5,562  requests/day  to  ACCESS_EINE_LOCATION  and 
1,277  requests/day  to  ACCESS  _COARSEXOCATION 
are  more  concerning,  as  this  could  enable  detailed  track¬ 
ing  of  the  user’s  movement  throughout  the  day.  Sim¬ 
ilarly,  a  user’s  location  can  be  inferred  by  using  AC- 
CESS_WIEI 5TATE  to  get  data  on  nearby  WiEi  SSIDs. 

Contextual  integrity  means  ensuring  that  information 
flows  are  appropriate,  as  determined  by  the  user.  Thus, 
users  need  the  ability  to  see  information  flows.  Current 
mobile  platforms  have  done  some  work  to  let  the  user 
know  about  location  tracking.  Eor  instance,  recent  ver¬ 
sions  of  Android  allow  users  to  see  which  applications 
have  used  location  data  recently.  While  attribution  is  a 
positive  step  towards  contextual  integrity,  attribution  is 
most  beneficial  for  actions  that  are  reversible,  whereas 
the  disclosure  of  location  information  is  not  something 
that  can  be  undone  [14].  We  observed  that  fewer  than  1% 
of  location  requests  were  made  when  the  applications 
were  visible  to  the  user  or  resulted  in  the  displaying  of  a 
GPS  notification  icon.  Given  that  Thompson  et  al. 
showed  that  most  users  do  not  understand  that  appli¬ 
cations  running  in  the  background  may  have  the  same 
abilities  as  applications  running  in  the  foreground  [42], 
it  is  likely  that  in  the  vast  majority  of  cases,  users  do  not 
know  when  their  locations  are  being  disclosed. 

This  low  visibility  rate  is  because  Android  only  shows  a 
notification  icon  when  the  GPS  sensor  is  accessed,  while 
offering  alternative  ways  of  inferring  location.  In  66.1% 
of  applications’  location  requests,  they  directly  queried 
the  TelephonyManager,  which  can  be  used  to  deter¬ 
mine  location  via  cellular  tower  information.  In  33.3% 
of  the  cases,  applications  requested  the  SSIDs  of  nearby 
WiEi  networks.  In  the  remaining  0.6%  of  cases,  applica- 
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tions  accessed  location  information  using  one  of  three 
built-in  location  providers:  GPS,  network,  or  passive. 
Applications  accessed  the  GPS  location  provider  only 
6%  of  the  time  (which  displayed  a  GPS  notification).  In 
the  other  94%  of  the  time,  13%  queried  the  network 
provider  (i.e.,  approximate  location  based  on  nearbycel- 
lular  towers  and  WiFi  SSIDs)  and  81%  queried  the  pas¬ 
sive  location  provider.  The  passive  location  provider 
caches  prior  requests  made  to  either  the  GPS  or  network 
providers.  Thus,  across  all  requests  for  location  data,  the 
GPS  notification  icon  appeared  0.04%  of  the  time. 

While  the  alternatives  to  querying  the  GPS  are  less  ac¬ 
curate,  users  are  still  surprised  by  their  accuracy  [17]. 
This  suggests  a  serious  violation  of  contextual  integrity, 
since  users  likely  have  no  idea  their  locations  are  being 
requested  in  the  vast  majority  of  cases.  Thus,  runtime  no¬ 
tifications  for  location  tracking  need  to  be  improved  [18]. 

Apart  from  these  invisible  location  requests,  we  also  ob¬ 
served  applications  reading  stored  SMS  messages  (125 
times  per  user/day),  reading  browser  history  (5  times  per 
user/day),  and  accessing  the  camera  (once  per  user/day). 
Though  the  use  of  these  permissions  does  not  necessarily 
lead  to  privacy  violations,  users  have  no  contextual  cues 
to  understand  that  these  requests  are  occurring. 

4.2  High  Frequency  Requests 

Some  permission  requests  occurred  so  frequently  that  a 
few  applications  (i.e.,  Facebook,  Facebook  Messenger, 
Google  Location  Reporting,  Google  Maps,  Farm  Heroes 
Saga)  had  to  be  rate  limited  in  our  log  files  (see  Section 
3.1.1),  so  that  the  logs  would  not  fill  up  users’  re¬ 
maining  storage  or  incur  performance  overhead.  Table  4 
shows  the  complete  list  of  application/permission  com¬ 
binations  that  exceeded  the  threshold.  For  instance,  the 
most  frequent  requests  came  from  Facebook  requesting 
ACCESS_NETWORK_STATE  with  an  average  interval 
of  213.88  ms  (i.e.,  almost  5  times  per  second). 

With  the  exception  of  Google’s  applications,  all  rate- 
limited  applications  made  excessive  requests  for  the 
connectivity  state.  We  hypothesize  that  once  these 
applications  lose  connectivity,  they  continuously  poll  the 
system  until  it  is  regained.  Their  use  of  the 
getActiveNetworkInf  o  ( )  method  results  in  permis¬ 
sion  checks  and  returns  Networkinfo  obj  ects,  which  al¬ 
low  them  to  determine  connection  state  (e.g.,  connected, 
disconnected,  etc.)  and  type  (e.g.,  WiEi,  Bluetooth,  cel¬ 
lular,  etc.).  Thus,  these  requests  do  not  appear  to  be  leak¬ 
ing  sensitive  information  per  se,  but  their  frequency  may 
have  adverse  effects  on  performance  and  battery  life.  It 
is  possible  that  using  the  ConnectivityManager’s 
NetworkCallback  method  may  be  able  to  fulfill  this 
need  with  far  fewer  permission  checks. 


Application  /  Permission 

Peak  (ms) 

Avg.  (ms) 

com.facebook.katana 

213.88 

956.97 

ACCESS  NETWORK  STATE 

com.facebook.orca 

334.78 

1146.05 

ACCESS  NETWORK  STATE 

com.google.android.apps.maps 

247.89 

624.61 

ACCESS  NETWORK  STATE 

com.google.process.gapps 

315.31 

315.31 

AUTHENTICATE  ACCOUNTS 

com.google.process.gapps 

898.94 

1400.20 

WAKE  LOCK 

com.google.process. location 

I76.II 

991.46 

WAKE  LOCK 

com.google.process. location 

1387.26 

1387.26 

ACCESS  FINE  LOCATION 

com.google.process. location 

373.41 

1878.88 

GET  ACCOUNTS 

com.google.process. location 

I90I.9I 

I90I.9I 

ACCESS  WIFI  STATE 

com.kingTarmheroessaga 

284.02 

731.27 

ACCESS  NETWORK^TATE 

com.pandora.android 

541.37 

541.37 

ACCESS-NETWORK -STATE 

com.taptu.streams 

1746.36 

1746.36 

ACCESSJ4ETWORKKTATE 

Table  4:  The  application/permission  combinations  that 
needed  to  be  rate  limited  during  the  study.  The  last  two 
columns  show  the  fastest  interval  recorded  and  the  aver¬ 
age  of  all  the  intervals  recorded  before  rate-limiting. 


4.3  F requency  of  Data  Exposure 

Eelt  et  al.  posited  that  while  most  permissions  can  be 
granted  automatically  in  order  to  not  habituate  users  to 
relatively  benign  risks,  certain  requests  should  require 
runtime  consent  [14].  They  advocated  using  runtime  di¬ 
alogs  before  the  following  actions  should  proceed: 

1.  Reading  location  information  (e.g.,  using  conven¬ 
tional  location  APIs,  scanning  WiEi  SSIDs,  etc.). 

2.  Reading  the  user’ s  web  browser  history. 

3.  Reading  saved  SMS  messages. 

4.  Sending  SMS  messages  that  incur  charges,  or  inap¬ 
propriately  spamming  the  user’s  contact  list. 

These  four  actions  are  governed  by  the  12  Android  per¬ 
missions  listed  in  Table  1.  Of  the  300  applications  that 
we  observed  during  the  experiment,  91  (30.3%)  per¬ 
formed  one  of  these  actions.  On  average,  these  permis¬ 
sions  were  requested  213  times  per  hour/user — ^roughly 
every  20  seconds.  However,  permission  checks  occur  un¬ 
der  a  variety  of  circumstances,  only  a  subset  of  which  ex¬ 
pose  sensitive  resources.  As  a  result,  platform  develop- 
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Resource 

VisibI 

Data  Exposed 

e 

Requests 

Invisib 

Data  Exposed 

le 

Requests 

Total 

Data  Exposed 

Requests 

Location 

758 

2,205 

3,881 

8,755 

4,639 

10,960 

Read  SMS  data 

378 

486 

72 

125 

450 

611 

Sending  SMS 

7 

7 

1 

1 

8 

8 

Browser  History 

12 

14 

2 

5 

14 

19 

Total 

1,155 

2,712 

3,956 

8,886 

5,111 

11,598 

Table  5;  The  sensitive  permission  requests  (per  user/day)  when  requesting  applications  were  visible/invisible  to  users. 
“Data  exposed”  reflects  the  subset  of  permission-protected  requests  that  resulted  in  sensitive  data  being  accessed. 


ers  may  decide  to  only  show  runtime  warnings  to  users 
when  protected  data  is  read  or  modified.  Thus,  we  at¬ 
tempted  to  quantify  the  frequency  with  which  permission 
checks  actually  result  in  access  to  sensitive  resources  for 
each  of  these  four  categories.  Table  5  shows  the  number 
of  requests  seen  per  user/day  under  each  of  these  four 
categories,  separating  the  instances  in  which  sensitive 
data  was  exposed  from  the  total  permission  requests  ob¬ 
served.  Unlike  Section  4.1,  we  include  “visible”  permis¬ 
sion  requests  (i.e.,  those  occurring  while  the  user  was  ac¬ 
tively  using  the  application  or  had  other  contextual  infor¬ 
mation  to  indicate  it  was  running).  We  didn’t  observe  any 
uses  of  NFC,  READ  .CALL  LOG,  ADD  VOICEMAIL, 
accessing  WRITE  _SYNC_  SETTINGS  or  INTERNET 
while  roaming  in  our  dataset. 

Of  the  location  permission  checks,  a  majority  were  due 
to  requests  for  location  provider  information  (e.g., 
getBestProvider  ( )  returns  the  best  location 
provider  based  on  application  requirements),  or  check¬ 
ing  WiEi  state  (e.g.,  getWif  iState  ( )  only  reveals 
whether  WiEi  is  enabled).  Only  a  portion  of  the  requests 
actually  exposed  participants’  locations  (e.g., 
getLastKnownLocation  ( )  or  getScanResults ( ) 
exposed  SSIDs  of  nearby  WiEi  networks). 

Although  a  majority  of  requests  for  the  READ.SMS  per¬ 
mission  exposed  content  in  the  SMS  store  (e.g..  Query  ( ) 
reads  the  contents  of  the  SMS  store),  a  considerablepor- 
tion  simply  read  information  about  the  SMS  store  (e.g., 
renewMmsConnectivity  ()  resets  an  applications’ con¬ 
nection  to  the  MMS  store).  An  exception  to  this  is  the  use 
of  SEND  SMS,  which  resulted  in  the  transmission  of  an 
SMS  message  every  time  the  permission  was  requested. 

Regarding  browser  history,  both  accessing  visited  URLs 
(getAllVisitedUrls  ())  and  reorganizing  bookmark 
folders  (addFolderToCurrent  ( ) )  result  in  the  same 
permission  being  checked.  However,  the  latter  does  not 
expose  specific  URLs  to  the  invoking  application. 

Our  analysis  of  the  API  calls  indicated  that  on  average, 
only  half  of  all  permission  checks  granted  applications 
access  to  sensitive  data.  Eor  instance,  across  both  visible 


and  invisible  requests,  5,111  of  the  11,598  (44.3%)  per¬ 
mission  checks  involving  the  12  permissions  in  Table  1 
resulted  in  the  exposure  of  sensitive  data  (Table  5). 

While  limiting  runtime  permission  requests  to  only  the 
cases  in  which  protected  resources  are  exposed  will 
greatly  decrease  the  number  of  user  interruptions,  the  fre¬ 
quency  with  which  these  requests  occur  is  still  too  great. 
Prompting  the  user  on  the  first  request  is  also  not  appro¬ 
priate  (e.g.,  ala  iOS  and  Android  M),  because  our  data 
show  that  in  the  vast  majority  of  cases,  the  user  has  no 
contextual  cues  to  understand  when  protected  resources 
are  being  accessed.  Thus,  a  user  may  grant  a  request  the 
first  time  an  application  asks,  because  it  is  appropriate  in 
that  instance,  but  then  she  may  be  surprised  to  find  that 
the  application  continues  to  access  that  resource  in  other 
contexts  (e.g.,  when  the  application  is  not  actively  used). 
As  a  result,  a  more  intelligent  method  is  needed  to  de¬ 
termine  when  a  given  permission  request  is  likely  to  be 
deemed  appropriate  by  the  user. 

5  User  Expectations  and  Reactions 

To  identify  when  users  might  want  to  be  prompted  about 
permission  requests,  our  exit  survey  focused  on 
participants’  reactions  to  the  12  permissions  in  Table  1, 
limiting  the  number  of  requests  shown  to  each 
participant  based  on  our  reservoir  sampling  algorithm, 
which  was  designed  to  ask  participants  about  a  diverse 
set  of  permission/application  combinations.  We  col¬ 
lected  participants’  reactions  to  673  permission  requests 
(t1  9/participant).  Of  these,  423  included  screenshots 
because  participants  were  actively  using  their  phones 
when  the  requests  were  made,  whereas  250  permission 
requests  were  performed  while  device  screens  wereoff.^ 
Of  the  former,  243  screenshots  were  taken  while  the  re¬ 
questing  application  was  visible  (Category  1  and  3  from 
Section  4.1),  whereas  180  were  taken  while  the  applica¬ 
tion  was  invisible  (Category  2  and  4  from  Section  4.1).  In 
this  section,  we  describe  the  situations  in  which  requests 

^Our  first  1 1  participants  did  not  answer  questions  about  permission 
requests  occurring  while  not  using  their  devices,  and  therefore  the  data 
only  corresponds  to  our  last  25  participants. 
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defied  users’  expectations.  We  present  explanations  for 
why  participants  wanted  to  block  certain  requests,  the 
factors  influencing  those  decisions,  and  how  expectations 
changed  when  devices  were  not  in  use. 

5.1  Reasons  for  Blocking 

When  viewing  screenshots  of  what  they  were  doing 
when  an  application  requested  a  permission,  30  partic¬ 
ipants  (80%  of  36)  stated  that  they  would  have  preferred 
to  block  at  least  one  request,  whereas  6  stated  a  willing¬ 
ness  to  allow  all  requests,  regardless  of  resource  type  or 
application.  Across  the  entire  study,  participants  wanted 
to  block  35%  of  these  423  permission  requests.  When  we 
asked  participants  to  explain  their  rationales  for  these  de¬ 
cisions,  two  main  themes  emerged:  the  request  did  not — 
in  their  minds — ^pertain  to  application  functionality  or  it 
involved  information  they  were  uncomfortable  sharing. 

5.1.1  Relevance  to  Application  Functionality 

When  prompted  for  the  reason  behind  blocking  a  permis¬ 
sion  request,  19  (53%  of  36)  participants  did  not  believe 
it  was  necessary  for  the  application  to  perform  its  task. 
Of  the  149  (35%  of  423)  requests  that  participants  would 
have  preferred  to  block,  79  (53%)  were  perceived  as  be¬ 
ing  irrelevant  to  the  functionality  of  the  application: 

•  “It  wasn’t  doing  anything  that  needed  my  current 
location.  ”  (PI) 

•  “I  don ’t  understand  why  this  app  would  do  anything 
with  SMS.”  (PIO) 

Accordingly,  functionality  was  the  mostcommon  reason 
for  wanting  a  permission  request  to  proceed.  Out  of  the 
274  permissible  requests,  195  (71%  of  274)  were  per¬ 
ceived  as  necessary  for  the  core  functionality  of  the  ap¬ 
plication,  as  noted  by  thirty-one  (86%  of  36)  participants: 

•  “Because  it’s  a  weather  app  and  it  needs  to  know 
where  you  are  to  give  you  weather  informa¬ 
tion.  ”(P13) 

•  “I  think  it  needs  to  read  the  SMS  to  keep  track  of  the 
chat  conversation.  ”(P12) 

Beyond  being  necessary  for  core  functionality,  partici¬ 
pants  wanted  10%  (27  of  274)  of  requests  to  proceedbe- 
cause  they  offered  convenience;  90%  of  these  requests 
were  for  location  data,  and  the  majority  of  those  appli¬ 
cations  were  published  under  the  Weather,  Social,  and 
Travel  &  Local  categories  in  the  Google  Play  store: 

•  “It  selects  the  closest  stop  to  me  so  I  don ’t  have  to 
scroll  through  the  whole  list.  ”  (PO) 

•  “This  app  should  read  my  current  location.  I’d  like 

for  it  to,  so  I  won ’t  have  to  manually  enter  in  my  zip 
code  /  area.  ”  (P4) 


Thus,  requests  were  allowed  when  they  were  expected: 
when  participants  rated  the  extent  to  which  each  request 
was  expected  on  a  5-point  Likert  scale,  allowable  re¬ 
quests  averaged  3.2,  whereas  blocked  requests  averaged 
2.3  (lower  is  less  expected). 

5.1.2  Privacy  Concerns 

Participants  also  wanted  to  deny  permission  requests  that 
involved  data  that  they  considered  sensitive,  regardless 
of  whether  they  believed  the  application  actually  needed 
the  data  to  function.  Nineteen  (53%  of  36)  participants 
noted  privacy  as  a  concern  while  blocking  a  request,  and 
of  the  149  requests  that  participants  wanted  to  block,  49 
(32%  of  149)  requests  were  blocked  for  this  reason: 

•  “SMS  messages  are  quite  personal.  ”  (P14) 

•  “It  is  part  of  a  personal  conversation.  ”  (PI  1) 

•  “Pictures  could  be  very  private  and  I  wouldn  ’t  like 
for  anybody  to  have  access.  ”  (PI 6) 

Conversely,  24  participants  (66%  of  36)  wanted  requests 
to  proceed  simply  because  they  did  not  believe  that  the 
data  involved  was  particularly  sensitive;  this  reasoning 
accounted  for  21%  of  the  274  allowablerequests: 

•  “I’m  ok  with  my  location  being  recorded,  no  con¬ 
cerns.  ”  (P3) 

•  “No  personal  info  being  shared.  ”  (P29) 

5.2  Influential  F  actors 

Based  on  participants’  responses  to  the  423  permission 
requests  involving  screenshots  (i.e.,  requests  occurring 
while  they  were  actively  using  their  phones),  we  quan¬ 
titatively  examined  how  various  factors  influenced  their 
desire  to  block  some  of  these  requests. 

Effects  of  Identifying  Permissions  on  Blocking:  In  the 

exit  survey,  we  asked  participants  to  guess  the  permis¬ 
sion  an  application  was  requesting,  based  on  the  screen- 
shot  of  what  they  were  doing  at  the  time.  The  real  an¬ 
swer  was  among  four  other  incorrect  answers.  Of  the  149 
cases  where  participants  wanted  to  block  permission 
requests,  they  were  only  able  to  correctly  state  what  per¬ 
mission  was  being  requested  24%  of  the  time;  whereas 
when  wanting  a  request  to  proceed,  they  correctly  iden¬ 
tified  the  requested  permission  44%  (120  of  274)  of  the 
time.  However,  Pearson’s  product-moment  test  on  the 
average  number  of  blocked  requests  per  user  and  the  av¬ 
erage  number  of  correct  answers  per  user^  did  not  yield  a 
statistically  significant  correlation  (r=-0.171,  p<0.317). 

Effects  of  Visibility  on  Expectations:  We  were  particu¬ 
larly  interested  in  exploring  if  permission  requests  orig¬ 
inating  from  foreground  applications  (i.e.,  visible  to  the 


^Both  measures  were  normally  distributed. 
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user)  were  more  expected  than  ones  from  background  ap¬ 
plications.  Of  the  243  visible  permission  requests  that  we 
asked  about  in  our  exit  survey,  participants  correctly 
identified  the  requested  permission  44%  of  the  time,  and 
their  average  rating  on  our  expectation  scale  was  3.4.  On 
the  other  hand,  participants  correctly  identified  the  re¬ 
sources  accessed  by  background  applications  only  29% 
of  the  time  (52  of  180),  and  their  average  rating  on  our 
expectation  scale  was  3.0.  A  Wilcoxon  Signed-Rank  test 
with  continuity  correction  revealed  a  statistically  sig¬ 
nificant  difference  in  participants’  expectations  between 
these  two  groups  (V=441.5,  p<0.001). 

Effects  of  Visibility  on  Blocking;  Participants  wanted 
to  block  7 1  (29%  of  243)  permission  requests  originat¬ 
ing  from  applications  running  in  the  foreground,  whereas 
this  increased  by  almost  50%  when  the  applications  were 
in  the  background  invisible  to  them  (43%  of  180).  We 
calculated  the  percentage  of  denials  for  each  partici¬ 
pant,  for  both  visible  and  invisible  requests.  A  Wilcoxon 
Signed-Rank  test  with  continuity  correction  revealed  a 
statistically  significant  difference  (V=58,  p<0.001). 

Effects  of  Privacy  Preferences  on  Blocking:  Partici¬ 
pants  completed  the  Privacy  Concerns  Scale  (PCS)  [10] 
and  the  Internet  Users’  Information  Privacy  Concerns 
(lUIPC)  scale  [31].  A  Spearman’ s  rank  test  yielded  no 
statistically  significant  correlation  between  their  privacy 
preferences  and  their  desire  to  block  permission  requests 
(r=  0.156,  p<0.364). 

Effects  of  Expectations  on  Blocking;  We  examined 
whether  participants’  expectations  surrounding  requests 
correlated  with  their  desire  to  block  them.  For  each  par¬ 
ticipant,  we  calculated  their  average  Likert  scores  for 
their  expectations  and  the  percentage  of  requests  that 
they  wanted  to  block.  Pearson’s  product-moment  test 
showed  a  statistically  significant  correlation  (r=;-0.39, 
p<0.018).  The  negative  correlation  shows  that  partici¬ 
pants  were  more  likely  to  deny  unexpected  requests. 


5.3  User  Inactivity  and  Resource  Access 

In  the  second  part  of  the  exit  survey,  participants  an¬ 
swered  questions  about  10  resource  requests  that  oc¬ 
curred  when  the  screen  was  off  (not  in  use).  Overall, 
they  were  more  likely  to  expect  resource  requests  to  oc¬ 
cur  when  using  their  devices  (jJ  =  3.26  versus  jU  =  2.66). 
They  also  stated  a  willingness  to  block  almost  half  of 
the  permission  requests  (49.6%  of 250)  when  not  in  use, 
compared  to  a  third  of  the  requests  that  occurred  when 
using  their  phones  (35.2%  of  423).  However,  neither  of 
these  differences  was  statisticallysignificant. 


6  Feasibility  of  Runtime  Requests 

Felt  et  al.  posited  that  certain  sensitive  permissions  (Ta¬ 
ble  1)  should  require  runtime  consent  [14],  but  in  Section 
4.3  we  showed  that  the  frequencies  with  which  applica¬ 
tions  are  requesting  these  permissions  make  it  impracti¬ 
cal  to  prompt  the  user  each  time  a  request  occurs.  In¬ 
stead,  the  major  mobile  platforms  have  shifted  towards  a 
model  of  prompting  the  user  the  first  time  an  application 
requests  access  to  certain  resources:  iOS  does  this  for  a 
selected  set  of  resources,  such  as  location  and  contacts, 
and  Android  M  does  this  for  “dangerous”  permissions. 

How  many  prompts  would  users  see,  if  we  added  runtime 
prompts  for  the  first  use  of  these  12  permissions?  We  an¬ 
alyzed  a  scheme  where  a  runtime  prompt  is  displayed  at 
most  once  for  each  unique  triplet  of  (application,  permis¬ 
sion,  application  visibility),  assuming  the  screen  is  on. 
With  a  naive  scheme,  our  study  data  indicates  our  partic¬ 
ipants  would  have  seen  an  average  of  34  runtime  prompts 
(ranging  from  13  to  77,  S=1 1).  As  a  refinement,  we  pro¬ 
pose  that  the  user  should  be  prompted  only  if  sensitive 
data  will  be  exposed  (Section  4.3),  reducing  the  average 
number  of  prompts  to  29. 

Of  these  29  prompts,  21  (72%)  are  related  to  location. 
Apple  iOS  already  prompts  users  when  an  application  ac¬ 
cesses  location  for  the  first  time,  with  no  evidence  of  user 
habituation  or  annoyance.  Focusing  on  the  remaining 
prompts,  we  see  that  our  policy  would  introduce  an  aver¬ 
age  of  8  new  prompts  per  user:  about  5  for  reading  SMS, 
1  for  sending  SMS,  and  2  for  reading  browser  history. 
Our  data  covers  only  the  first  week  of  use,  but  as  we  only 
prompt  on  first  use  of  a  permission,  we  expect  that  the 
number  of  prompts  would  decline  greatly  in  subsequent 
weeks,  suggesting  that  this  policy  would  likely  not  intro¬ 
duce  significant  risk  of  habituation  or  annoyance.  Thus, 
our  results  suggest  adding  runtime  prompts  for  reading 
SMS,  sending  SMS,  and  reading  browser  history  would 
be  useful  given  their  sensitivity  and  low  frequency. 

Our  data  suggests  that  taking  visibility  into  account  is 
important.  If  we  ignore  visibility  and  prompted  only  once 
for  each  pair  of  (application,  permission),  users  would 
have  no  way  to  select  a  different  policy  for  when  the 
application  is  visible  or  not  visible.  In  contrast,  “ask-  on- 
first-use”  for  the  triple  (application,  permission,  visi¬ 
bility)  gives  users  the  option  to  vary  their  decision  based 
on  the  visibility  of  the  requesting  application.  We  evalu¬ 
ated  these  two  policies  by  analyzing  the  exit  survey  data 
(limited  to  situations  where  the  screen  was  on)  for  cases 
where  the  same  user  was  asked  twice  in  the  survey  about 
situations  with  the  same  (application,  permission)  pair  or 
the  same  (application,  permission,  visibility)  triplet,  to 
see  whether  the  user’s  first  decision  to  block  or  not 
matched  their  subsequent  decisions.  For  the  formerpol- 
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icy,  we  saw  only  51.3%  agreement;  for  the  latter,  agree¬ 
ment  increased  to  83.5%.  This  suggests  that  the  (applica¬ 
tion,  permission,  visibility)  triplet  captures  many  of  the 
contextual  factors  that  users  care  about,  and  thus  it  is  rea¬ 
sonable  to  prompt  only  once  per  unique  triplet. 

A  complicating  factor  is  that  applications  can  also  run 
even  when  the  user  is  not  actively  using  the  phone.  In 
addition  to  the  29  prompts  mentioned  above,  our  data 
indicates  applications  would  have  triggered  an  average 
of  7  more  prompts  while  the  user  was  not  actively  using 
the  phone:  6  for  location  and  one  for  reading  SMS.  It  is 
not  clear  how  to  handle  prompts  when  the  user  is  not 
available  to  respond  to  the  prompt:  attribution  might  be 
helpful,  but  further  research  is  needed. 

6.1  Modeling  Users’  Decisions 

We  constructed  several  statistical  models  to  examine 
whether  users’  desire  to  block  certain  permission  re¬ 
quests  could  be  predicted  using  the  contextual  data  that 
we  collected.  If  such  a  relationship  exists,  a  classifier 
could  determine  when  to  deny  potentially  unexpected 
permission  requests  without  user  intervention.  Con¬ 
versely,  the  classifier  could  be  used  to  only  prompt  the 
user  about  questionable  data  requests.  Thus,  the  response 
variable  in  our  models  is  the  user’s  choice  of  whether  to 
block  the  given  permission  request.  Our  predictive  vari¬ 
ables  consisted  of  the  information  that  might  be  available 
at  runtime:  permission  type  (with  the  restriction  that  the 
invoked  function  exposes  data),  requesting  application, 
and  visibility  of  that  application.  We  constructed  sev¬ 
eral  mixed  effects  binary  logistic  regression  models  to 
account  for  both  inter-subject  and  intra-subjecteffects. 

6.1.1  Model  Selection 

In  our  mixed  effects  models,  permission  types  and  the 
visibility  of  the  requesting  application  were  fixed  effects, 
because  all  possible  values  for  each  variable  existed  in 
our  data  set.  Visibility  had  two  values:  visible  (the  user 
is  interacting  with  the  application  or  has  other  contextual 
cues  to  know  that  it  is  running)  and  invisible.  Permission 
types  were  categorized  based  on  Table  5.  The  application 
name  and  the  participant  ID  were  included  as  random  ef¬ 
fects,  because  our  survey  data  did  not  have  an  exhaustive 
list  of  all  possible  applications  a  user  could  run,  and  the 
participant  has  a  non-systematic  effect  on  the  data. 

Table  6  shows  two  goodness-of-fit  metrics:  the  Akaike 
Information  Criterion  (AIC)  and  Bayesian  Information 
Criterion  (BIC).  Lower  values  for  AIC  and  BIC  repre¬ 
sent  better  fit.  Table  6  shows  the  different  parameters 
included  in  each  model.  We  found  no  evidence  of  inter¬ 
action  effects  and  therefore  did  not  include  them.  Visual 
inspection  of  residual  plots  of  each  model  did  not  reveal 
obvious  deviations  from  homoscedasticity  or  normality. 


Predictors 

AIC 

BIC 

Screen  State 

UserCode 

490.60 

498.69 

Screen  On 

Application 

545.98 

554.07 

Screen  On 

Application 

UserCode 

491.86 

503.99 

Screen  On 

Permission 

Application 

UserCode 

494.69 

527.05 

Screen  On 

Visibility 

Application 

UserCode 

481.65 

497.83 

Screen  On 

Permission 

Visibility 

Application 

UserCode 

484.23 

520.64 

Screen  On 

UserCode 

245.13 

252.25 

Screen  Off 

Application 

349.38 

356.50 

Screen  Off 

Application 

UserCode 

238.84 

249.52 

Screen  Off 

Permission 

Application 

UserCode 

235.48 

263.97 

Screen  Off 

Table  6:  Goodness-of-fit  metrics  for  various  mixed  ef¬ 
fects  logistic  regression  models  on  the  exit  survey  data. 


We  initially  included  the  phone’s  screen  state  as  another 
variable.  However,  we  found  that  creating  two  separate 
models  based  on  the  screen  state  resulted  in  better  fit  than 
using  a  single  model  that  accounted  for  screen  state  as  a 
fixed  effect.  When  the  screen  was  on,  the  best  fit  was  a 
model  including  application  visibility  and  appli-  cation 
name,  while  controlling  for  subject  effects.  Here,  fit 
improved  once  permission  type  was  removed  from  the 
model,  which  shows  that  the  decision  to  block  a  permis¬ 
sion  request  was  based  on  contextual  factors:  users  do 
not  categorically  deny  permission  requests  based  solely 
on  the  type  of  resource  being  accessed  (i.e.,  they  also  ac¬ 
count  for  their  trust  in  the  application,  as  well  as  whether 
they  happened  to  be  actively  using  it).  When  the  screen 
was  off,  however,  the  effect  of  permission  type  was  rela¬ 
tively  stronger.  The  strong  subject  effect  in  both  models 
indicates  that  these  decisions  vary  from  one  user  to  the 
next.  As  a  result,  any  classifier  developed  to  automati¬ 
cally  decide  whether  to  block  a  permission  at  runtime  (or 
prompt  the  user)  will  need  to  be  tailored  to  that  particular 
user’s  needs. 

6.1.2  Predicting  User  Reactions 

Using  these  two  models,  we  built  two  classifiers  to  make 
decisions  about  whether  to  block  any  of  the  sensitive  per¬ 
mission  requests  listed  in  Table  5.  We  used  our  exit  sur¬ 
vey  data  as  ground  truth,  and  used  5-fold  cross-validation 
to  evaluate  model  accuracy. 
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We  calculated  the  receiver  operating  characteristic 
(ROC)  to  capture  the  tradeoff  between  true-positive  and 
false-positive  rate.  The  quality  of  the  classifier  can  be 
quantified  with  a  single  value  by  calculating  the  area  un¬ 
der  its  ROC  curve  (AUC)  [23].  The  closer  the  AUC  gets 
to  1.0,  the  better  the  classifier  is.  When  screens  were  on, 
the  AUC  was  0.7,  which  is  40%  better  than  the  random 
baseline  (0.5).  When  screens  were  off,  the  AUC  was  0.8, 
which  is  60%  better  than  a  random  baseline. 

7  Discussion 

During  the  study,  80%  of  our  participants  deemed  at  least 
one  permission  request  as  inappropriate.  This  violates 
Nissenbaum’s  notion  of  “privacy  as  contextual  integrity” 
because  applications  were  performing  actions  that  defied 
users’  expectations  [33].  Felt  et  al.  posited  that  users  may 
be  able  to  better  understand  why  permissionrequests  are 
needed  if  some  of  these  requests  are  granted  via  runtime 
consent  dialogs,  rather  than  Android’s  previous  install¬ 
time  notification  approach  [14].  By  granting  permissions 
at  runtime,  users  will  have  additional  contextual  infor¬ 
mation;  based  on  what  they  were  doing  at  the  time  that 
resources  are  requested,  they  may  have  a  better  idea  of 
why  those  resources  are  being  requested. 

We  make  two  primary  contributions  that  system  design¬ 
ers  can  use  to  make  more  usable  permissions  systems. 
We  show  that  the  visibility  of  the  requesting  application 
and  the  frequency  at  which  requests  occur  are  two 
important  factors  in  designing  a  runtime  consent  plat¬ 
form.  Also,  we  show  that  “prompt-on-first-use”  per 
triplet  could  be  implemented  for  some  sensitive  permis¬ 
sions  without  risking  user  habituation  or  annoyance. 

Based  on  the  frequency  with  which  runtime  permissions 
are  requested  (Section  4),  it  is  infeasible  to  prompt  users 
every  time.  Doing  so  would  overwhelm  them  and  lead  to 
habituation.  At  the  same  time,  drawing  user  attention  to 
the  situations  in  which  users  are  likely  to  be  concerned 
will  lead  to  greater  control  and  awareness.  Thus,  the 
challenge  is  in  acquiring  their  preferences  by  confronting 
them  minimally  and  then  automatically  inferring  when 
users  are  likely  to  find  a  permission  request  unexpected, 
and  only  prompting  them  in  these  cases.  Our  data  sug¬ 
gests  that  participants’  desires  to  block  particular  permis¬ 
sions  were  heavily  influenced  by  two  main  factors:  their 
understanding  of  the  relevance  of  a  permission  request  to 
the  functionality  of  the  requesting  application  and  their 
individual  privacy  concerns. 

Our  models  in  Section  6.1  showed  that  individual  char¬ 
acteristics  greatly  explain  the  variance  between  what  dif¬ 
ferent  users  deem  appropriate,  in  terms  of  access  to  pro¬ 
tected  resources.  While  responses  to  privacy  scales  failed 
to  explain  these  differences,  this  was  not  a  surprise,  as  the 


disconnect  between  stated  privacy  preferences  and  be¬ 
haviors  is  well-documented  (e.g.,  [1]).  This  means  that 
in  order  to  accurately  model  user  preferences,  the  sys¬ 
tem  will  need  to  learn  what  a  specific  user  deems  in¬ 
appropriate  over  time.  Thus,  a  feedback  loop  is  likely 
needed:  when  devices  are  “new,”  users  will  be  required 
to  provide  more  input  surrounding  permission  requests, 
and  then  based  on  their  responses,  they  will  see  fewer 
requests  in  the  future.  Our  data  suggests  that  prompting 
once  for  each  unique  (application,  permission,  applica¬ 
tion  visibility)  triplet  can  serve  as  a  practical  mechanism 
in  acquiring  users’  privacypreferences. 

Beyond  individual  subject  characteristics  (i.e.,  personal 
preferences),  participants  based  their  decisions  to  block 
certain  permission  requests  on  the  specific  applications 
making  the  requests  and  whether  they  had  contextual 
cues  to  indicate  that  the  applications  were  running  (and 
therefore  needed  the  data  to  function).  Future  systems 
could  take  these  factors  into  account  when  deciding 
whether  or  not  to  draw  user  attention  to  a  particular  re¬ 
quest.  For  example,  when  an  application  that  a  user  is  not 
actively  using  requests  access  to  a  protected  resource,  she 
should  be  shown  a  runtime  prompt.  Our  data  indicates 
that,  if  the  user  decides  to  grant  a  request  in  this  situation, 
then  with  probability  0.84  the  same  decision  will  hold  in 
future  situations  where  she  is  actively  using  that  same 
application,  and  therefore  a  subsequent  prompt  may  not 
be  needed.  At  a  minimum,  platforms  need  to  treat 
permission  requests  from  background  applications 
differently  than  those  originating  from  foreground  ap¬ 
plications.  Similarly,  applications  running  in  the  back¬ 
ground  should  use  passive  indicators  to  communicate 
when  they  are  accessing  particular  resources.  Platforms 
can  also  be  designed  to  make  decisions  about  whether  or 
not  access  to  resources  should  be  granted  based  on 
whether  contextual  cues  are  present,  or  at  its  most  basic, 
whether  the  device  screen  is  even  on. 

Finally,  we  built  our  models  and  analyzed  our  data  within 
the  framework  of  what  resources  our  participants  be¬ 
lieved  were  necessary  for  applications  to  correctly  func¬ 
tion.  Obviously,  their  perceptions  may  have  been  incor¬ 
rect:  if  they  better  understood  why  a  particular  resource 
was  necessary,  they  may  have  been  more  permissive. 
Thus,  it  is  incumbent  on  developers  to  adequately  com¬ 
municate  why  particular  resources  are  needed,  as  this  im¬ 
pacts  user  notions  of  contextual  integrity.  Yet,  no  mecha¬ 
nisms  in  Android  exist  for  developers  to  do  this  as  part  of 
the  permission-granting  process.  For  example,  one  could 
imagine  requiring  metadata  to  be  provided  that  explains 
how  each  requested  resource  will  be  used,  and  then  auto¬ 
matically  integrating  this  information  into  permission  re¬ 
quests.  Tan  et  al.  examined  a  similar  feature  on  iOS  that 
allows  developers  to  include  free-form  text  in  runtime 


APROVED  FOR  PUBLIC  RELEASE;  DISTRIBUTION  UNLIMITED 

42 


permission  dialogs  and  observed  that  users  were  more 
likely  to  grant  requests  that  included  this  text  [41].  Thus, 
we  believe  that  including  succinct  explanations  in  these 
requests  would  help  preserve  contextual  integrity  by  pro¬ 
moting  greater  transparency. 

In  conclusion,  we  believe  this  study  was  instructive  in 
showing  the  circumstances  in  which  Android  permission 
requests  are  made  under  real-world  usage.  While  prior 
work  has  already  identified  some  limitations  of  deployed 
mobile  permissions  systems,  we  believe  our  study  can 
benefit  system  designers  by  demonstrating  several  ways 
in  which  contextual  integrity  can  be  improved,  thereby 
empowering  users  to  make  better  security  decisions. 
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A  Invisible  requests 

Following  list  shows  the  set  of  applications  that  have  requested  the  most 

number  of  permissions  while  executing  invisibly  to  the  user  and  the 

most  requested  permission  types  by  each  respective  application. 

-  FacebookApp—  ACCESS  NETWORK  STATE,  ACCESS  FINE 
LOCATION,  ACCESS  WIFI  STATE  ,WAKE  LOCK, 

-  Google  Location— LOCK,  ACCESS  FINE  LOCATION, 
GET  ACCOUNTS,  ACCESS  COARSE  LOCATION, 

-  Facebook  Messenger— ACCESS  NETWORK  STATE,  ACCESS 
WIFI  STATE,  WAKE  LOCK,  READ  PHONE  STATE, 

-  Taptu  DJ— ACCESS  NETWORK  STATE,  INTERNET,  NFC 

-  Google  Maps— ACCESS  NETWORK  STATE,  GET  AC¬ 
COUNTS,  WAKE  LOCK,  ACCESS  FINE  LOCATION, 

-  Google  (Gapps)—'<NAKE  LOCK,  ACCESS  FINE  LOCA¬ 
TION,  AUTHENTICATE  ACCOUNTS,  ACCESS  NETWORK 
STATE, 

-  Fouraquare— ACCESS  WIFI  STATE,  WAKE  LOCK,  ACCESS 
FINE  LOCATION,  INTERNET, 

-  Yahoo  Weather— ACCESS  FINE  LOCATION,  ACCESS  NET¬ 
WORK  STATE,  INTERNET,  ACCESS  WIFI  STATE, 
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•  Devexpert  Weather— ACCESS  NETWORK  STATE,  INTER¬ 
NET,  ACCESS  FINE  LOCATION, 

-  Tile  Game(Umoni}— ACCESS  NETWORK  STATE,  WAKE 
LOCK,  INTERNET,  ACCESS  WIFI  STATE, 

Following  is  the  most  frequently  requested  permission  type  by  appli¬ 
cations  while  running  invisibly  to  the  user  and  the  applications  who 
requested  the  respective  permission  type  most. 

•  ACCESSJCETWORKSTATE —  Facebook  App,  Google  Maps, 
Facebook  Messenger,  Google  (Gapps),  Taptu  -  DJ 

•  WAKEJ.OCK — -Google  (Location),  Google  (Gapps),  Google 
(GMS),  Facebook  App,  GTalk. 

•  ACCESS-FINE_L0CAT10N — Google  (Location),  Google 
(Gapps),  Facebook  App,  Yahoo  Weather,  Rhapsody  (Music) 

•  GET_ACCOUNTS— Google  (Location),  Google  (Gapps), 
Google  (Login),  Google  (GM),  Google  (Vending) 

•  ACCESS JWIFI STATE — Google  (Location),  Google  (Gapps), 
Facebook  App,  Foursqaure,  Facebook  Messenger 

-  UPDATE_DEVICESTATS— Google  (SystemUI),  Google  (Loca¬ 
tion),  Google  (Gapps) 

•  ACCESSCOARSEJ.OCATION — Google  (Location),  Google 
(Gapps),  Google  (News),  Facebook  App,  Google  Maps 

.  AUTHENTICATE  ACCOUNTS— Google  (Gapps),  Google  (Lo¬ 
gin),  Twitter,  Yahoo  Mail,  Google  (GMS) 

-  READ.SYNC.  SETTINGS— Google  (GM),  Google  (  GMS  ),  an¬ 
droid. process. acore,  Google  (Email),  Google  (Gapps) 

•  INTERNET — Google  (Vending),  Google  (Gapps),  Google  (GM), 
Facebook  App,  Google  (Location) 


C  Permission  Type  Breakdown 

This  table  lists  the  most  frequently  used  permissions  during  the  study 
period,  (per  user  /  per  day) 


Permission  Type 

Requests 

ACCESS  NETWORK _STATE 

41077 

WAKE  LOCK 

27030 

ACCESS  EINE  LOCATION 

7400 

GET  ACCOUNTS 

4387 

UPDATE  DEVICE  STATS 

2873 

ACCESS“WIEI  STATE 

2092 

ACCESS“COARSE  LOCATION 

1468 

AUTHENTICATE  ACCOUNTS 

1335 

READ  SYNC  SETTINGS 

836 

VIBRATE  - 

740 

INTERNET 

739 

READSMS 

611 

READ  PHONE  STATE 

345 

STATUS  BAR 

290 

WRITE  SYNC  SETTINGS 

206 

CHANGE  COMPONENT  ENABLED  STATE 

197 

CHANGE^  WIFE  STATE 

168 

READUALENDAR 

166 

ACCOUNT-MANAGER 

134 

ACCESS-ALLDOWNLOADS 

127 

READHXTERNAL  -STORAGE 

126 

USE-CREDENTIALS 

101 

READ  DOGS 

94 

B  Distribution  of  Requests 

The  following  graph  shows  the  distribution  of  requests  throughout  a 
given  day  averaged  across  the  data  set. 


Hour  of  the  day 


D  User  Application  Breakdown 

This  table  shows  the  applications  that  most  frequently  requested  access 
to  protected  resources  during  the  study  period,  (per  user  /  per  day) 


Application  Name 

Requests 

facebook.katana 

40041 

google.process. location 

32426 

facebook.orca 

24702 

taptu.streams 

15188 

google.android.apps.maps 

6501 

google.process. gapps 

5340 

yahoo .  mobile  .client,  android,  weather 

5505 

tumblr 

4251 

king.farmheroessaga 

3862 

j  oelapenna.  foursquared 

3729 

telenav.app.android.scout  us 

3335 

devexpert.  weather 

2909 

ch.bitspin.timely 

2549 

umonistudio.tile 

2478 

king.candycrushsaga 

2448 

android,  systemui 

2376 

bambuna.podcastaddict 

2087 

contapps .  android 

1662 

handcent.nextsms 

1543 

foursquare.robin 

1408 

qisiemoji.inputmethod 

1384 

devian.tubemate.home 

1296 

lookout 

1158 
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Abstract — Current  smartphone  operating  systems  regulate  ap¬ 
plication  permissions  by  prompting  users  on  an  ask-on-first-use 
basis.  Prior  research  has  shown  that  this  method  is  ineffective 
because  it  fails  to  account  for  context:  the  circumstances  under 
which  an  application  first  requests  access  to  data  may  be  vastly 
different  than  the  circumstances  under  which  it  subsequently 
requests  access.  We  performed  a  longitudinal  131-person  field 
study  to  analyze  the  contextuality  behind  user  privacy  decisions  to 
regulate  access  to  sensitive  resources.  We  built  a  classifier  to  make 
privacy  decisions  on  the  user’s  behalf  by  detecting  when  context 
has  changed  and,  when  necessary,  inferring  privacy  preferences 
based  on  the  user’s  past  decisions  and  behavior.  Our  goal  is  to 
automatically  grant  appropriate  resource  requests  without 
further  user  intervention,  deny  inappropriate  requests,  and  only 
prompt  the  user  when  the  system  is  uncertain  of  the  user’s 
preferences.  We  show  that  our  approach  can  accurately  predict 
users’  privacy  decisions  96.8%  of  the  time,  which  is  a  four-fold 
reduction  in  error  rate  compared  to  current  systems. 

I.  Introduction 

One  of  the  roles  of  a  mobile  application  platform  is  to  help 
users  avoid  unexpected  or  unwanted  use  of  their  personal  data 
[12].  Mobile  platforms  currently  use  permission  systems  to 
regulate  access  to  sensitive  resources,  relying  on  user  prompts 
to  determine  whether  a  third-party  application  should  be 
granted  or  denied  access  to  data  and  resources.  One  critical 
caveat  in  this  approach,  however,  is  that  mobile  platforms  seek 
the  consent  of  the  user  the  first  time  a  given  application 
attempts  to  access  a  certain  data  type  and  then  enforce  the  user’s 
decision  for  all  subsequent  cases,  regardless  of  the 
circumstances  surrounding  each  access.  For  example,  a  user 
may  grant  an  application  access  to  location  data  because  she  is 
using  location-based  features,  but  by  doing  this,  the  ap¬ 
plication  can  subsequently  access  location  data  for  behavioral 
advertising,  which  may  violate  the  user’s  preferences. 

Earlier  versions  of  Android  (5.1  and  below)  asked  users  to 
make  privacy  decisions  during  application  installation  as  an  all- 
or-nothing  ultimatum  (ask-on-install):  either  all  requested 
permissions  are  approved  or  the  application  is  not  installed. 
Previous  research  showed  that  few  people  read  the  requested 
permissions  at  install-time  and  even  fewer  correctly  under¬ 
stood  them  [17].  Furthermore,  install-time  permissions  do  not 
present  users  with  the  context  in  which  those  permission  will 


be  exercised,  which  may  cause  users  to  make  suboptimal  de¬ 
cisions  not  aligned  with  their  actual  preferences.  For  example, 
Egelman  et  al.  observed  that  when  an  application  requests 
access  to  location  data  without  providing  context,  users  are  just 
as  likely  to  see  this  as  a  signal  for  desirable  location-based 
features  as  they  are  an  invasion  of  privacy  [11].  Asking  users  to 
make  permission  decisions  at  runtime — at  the  moment  when 
the  permission  will  actually  be  used  by  the  application — 
provides  more  context  (i.e.,  what  they  were  doing  at  the  time 
that  data  was  requested)  [15].  However,  due  to  the  high 
frequency  of  permission  requests,  it  is  not  feasible  to  prompt 
the  user  every  time  data  is  accessed  [43]. 

In  iOS  and  Android  M,  the  user  is  now  prompted  at  runtime 
the  first  time  an  application  attempts  to  access  one  of  a  set  of 
“dangerous”  permission  types  (e.g.,  location,  contacts,  etc.). 
This  ask-on-first-use  (AOFU)  model  is  an  improvement  over 
ask-on-install  (AOI).  Prompting  users  the  first  time  an  applica¬ 
tion  uses  one  of  the  designated  permissions  gives  users  abetter 
sense  of  context:  their  knowledge  of  what  they  were  doing 
when  the  application  first  tried  to  access  the  data  should  help 
them  determine  whether  the  request  is  appropriate.  Despite  that, 
Wijesekera  et  al.  showed  that  AOFU  fails  to  meet  user 
expectations  over  half  the  time.  This  is  because  AOFU  does  not 
account  for  the  varying  contexts  of  future  requests  [43]. 

The  notion  of  contextual  integrity  suggests  that  many  per¬ 
mission  models  fail  to  protect  user  privacy  because  they  fail  to 
account  for  the  context  surrounding  data  flows  [34].  That  is, 
privacy  violations  occur  when  sensitive  resources  are  used  in 
ways  that  defy  users’  expectations.  Weposit  that  more  effective 
permission  models  must  focus  on  whether  resource  accesses  are 
likely  to  defy  users’  expectations  in  a  given  context — not 
simply  whether  the  application  was  authorized  to  receive  data 
the  first  time  it  asked  for  it.  Thus,  the  challenge  for  system 
designers  is  to  correctly  infer  when  the  context  surrounding  a 
data  request  has  changed,  and  whether  the  new  context  is  likely 
to  be  deemed  “appropriate”  or  “inappropriate”  for  the  given 
user.  Dynamically  regulating  data  access  based  on  the  context 
requires  more  user  involvement  to  understand  users’  contextual 
preferences.  If  users  are  asked  to  make  privacy  decisions  too 
frequently,  or  under  circumstances  that  are  seen  as  low-risk, 
they  may  become  habituated  to  future,  more  serious,  privacy 
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decisions.  On  the  other  hand,  if  users  are  asked  to  make  too  few 
privacy  decisions,  they  may  find  that  the  system  has  acted 
against  their  wishes.  Thus,  our  goal  is  to  automatically 
determine  when  and  under  what  circumstances  the  system 
presents  users  with  runtime  prompts. 

To  this  end,  we  collected  real-world  Android  usage  data  in 
order  to  explore  whether  we  could  infer  users’  future  privacy 
decisions  based  on  their  past  privacy  decisions,  contextual 
circumstances  surrounding  applications’  data  requests,  and 
users’  behavioral  traits.  We  conducted  a  field  study  where  131 
participants  used  Android  phones  that  were  instrumented  to 
gather  data  over  an  average  of  32  days  per  participant.  Also, 
their  phones  periodically  prompted  them  to  make  privacy 
decisions  when  applications  used  sensitive  permissions,  and  we 
logged  their  decisions.  Overall,  participants  wanted  to  block 
60%  of  these  requests.  We  found  that  AOFU  yields  84% 
accuracy,  i.e.,  its  policy  agrees  with  participants’  prompted 
responses  84%  of  the  time.  AOI  achieves  only  25%  accuracy. 
We  designed  new  techniques  that  use  machine  learning  to 
automatically  predict  how  users  would  respond  to  prompts,  so 
that  we  can  avoid  prompting  them  in  most  cases,  thereby  re¬ 
ducing  user  burden.  Our  classifier  uses  the  user’s  past  decisions 
in  similar  situations  to  predict  their  response  to  a  particular 
permission  request.  The  classifier  outputs  a  prediction  and  a 
confidence  score;  if  the  classifier  is  sufficiently  confident, 
we  use  its  prediction,  otherwise  we  prompt  the  user  for  their 
decision.  We  also  incorporate  information  about  the  user’s 
behavior  in  other  security  and  privacy  situations  to  make 
inferences  about  their  preferences:  whether  they  have  a  screen 
lock  activated,  how  often  they  visit  HTTPS  websites,  and  so 
on.  We  show  that  our  scheme  achieves  96.8%  accuracy  (a  4 
reduction  in  error  rate  over  AOFU)  with  significantly  less  user 
involvement  than  the  status  quo. 

The  specific  contributions  of  our  work  are  the  following: 

-  We  conducted  the  first  known  large-scale  study  on  quan¬ 
tifying  the  effectiveness  of  ask-on-first-use  permissions. 

-  We  show  that  a  significant  portion  of  the  studied  par¬ 
ticipants  make  contextual  decisions  on  permissions —  the 
foreground  application  and  the  visibility  of  the 
permission-requesting  application  are  strong  cues  partic¬ 
ipants  used  to  make  contextual  decisions. 

-  We  show  how  a  machine-learned  model  can  incorporate 
context  and  better  predict  users’  privacy  decisions. 

-  To  our  knowledge,  we  are  the  first  to  use  passively 
observed  traits  to  infer  future  privacy  decisions  on  a  case- 
by-case  basis  at  runtime. 

11.  Related  Work 

There  is  a  large  body  of  work  demonstrating  that  install-  time 
prompts  fail  because  users  do  not  understand  or  pay  attention 
to  them  [19],  [23],  [42].  When  using  install-time  prompts,  users 
often  do  not  understand  which  permission  types  correspond  to 
which  sensitive  resources  and  are  surprised  by  the  ability  of 
background  applications  to  collect  information  [17],  [22],  [41]. 
Applications  also  transmit  a  large  amount  of  location  or  other 
sensitive  data  to  third  parties  without 


user  consent  [12].  When  possible  risks  associated  with  these 
requests  are  revealed  to  users,  their  concerns  range  from  being 
annoyed  to  wanting  to  seek  retribution  [16]. 

To  mitigate  some  of  these  problems,  systems  have  been 
developed  to  track  information  flows  across  the  Android 
system  [12],  [18],  [24]  or  introduce  finer-grained  permission 
control  into  Android  [2],  [21],  [39],  but  many  of  these  solu¬ 
tions  increase  user  involvement  significantly,  which  can  lead  to 
habituation.  Additionally,  many  of  these  proposals  are  useful 
only  to  the  most-motivated  or  technically  savvy  users.  For 
example,  many  such  systems  require  users  to  configure  com¬ 
plicated  control  panels,  which  many  are  unlikely  to  do  [45]. 
Other  approaches  involve  static  analysis  in  order  to  better 
understand  how  applications  could  request  information  [4],  [8], 
[14],  but  these  say  little  about  how  applications  actually  use 
information.  Dynamic  analysis  improves  upon  this  by  allowing 
users  to  see  how  often  this  information  is  requested  in  real  time 
[12],  [40],  [43],  but  substantial  work  is  likely  needed  to  present 
that  information  to  average  users  in  a  meaningful  way. 
Solutions  that  require  user  interruptions  need  to  also  minimize 
user  intervention  in  order  to  prevent  habituation. 

Other  researchers  have  developed  recommendation  systems 
to  recommend  applications  based  on  users’  privacy  prefer¬ 
ences  [46],  or  detect  privacy  violations  and  suggest  prefer¬ 
ences  based  on  crowdsourcing  [1],  [27],  but  such  approaches 
often  do  not  take  individual  user  differences  into  account 
without  significant  user  intervention.  Systems  have  also  been 
developed  to  predict  what  users  would  share  on  mobile  social 
networks  [7],  which  suggests  that  future  systems  could 
potentially  infer  what  information  users  would  be  willing  to 
share  with  third-party  applications.  By  requiring  users  to  self- 
report  privacy  preferences,  clustering  algorithms  have  been 
used  to  define  user  privacy  profiles  even  in  the  face  of  diverse 
preferences  [26],  [38].  However,  researchers  have  found  that 
the  order  in  which  information  is  requested  has  an  impact  on 
prediction  accuracy  [44],  which  could  mean  that  such  systems 
are  only  likely  to  be  accurate  when  they  examine  actual  user 
behavior  over  time  (as  opposed  to  one-time  self-reports). 

Liu  et  al.  clustered  users  by  privacy  preferences  and  used  ML 
techniques  to  predict  whether  to  allow  or  deny  an  ap¬ 
plication’s  request  for  sensitive  user  data  [29].  Their  dataset, 
however,  was  collected  from  a  set  of  highly  privacy-conscious 
individuals:  those  who  choose  to  install  a  permission-control 
mechanism.  Furthermore,  the  researchers  removed  “conflict¬ 
ing”  user  decisions,  in  which  a  user  chose  to  deny  a  permission 
for  an  application,  and  then  later  chose  to  allow  it.  These 
conflicting  decisions,  however,  do  not  represent  noisy  data. 
They  occur  nearly  50%  of  the  time  in  the  real  world  [43],  and 
accurately  reflect  the  nuances  of  user  privacy  preferences. 
Models  must  therefore  account  for  them.  In  fact,  previous  work 
found  that  users  commonly  reassess  privacy  preferences  after 
usage  [3].  Liu  et  al.  also  expect  users  to  make  10%  of  permis¬ 
sion  decisions  manually,  which,  based  on  field  study  results 
from  Wijesekera  et  al.,  would  result  in  being  prompted  every 
three  minutes  [43].  This  is  obviously  impractical.  Our  goal  is  to 
design  a  system  that  can  automatically  make  decisions  on 
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behalf  of  users,  that  accurately  models  their  preferences,  while 
also  not  over -burdening  them  with  repeated  requests. 

Closely  related  to  this  work,  Liu  et  al.  [28]  performed  a  field 
study  to  measure  the  effectiveness  of  a  Privacy  Assistant  that 
offers  recommendations  to  users  on  privacy  settings  that  they 
could  adopt  based  on  each  user’s  privacy  profile — the  privacy 
assistant  predicts  what  the  user  might  want  based  on  the 
inferred  privacy  profile  and  static  analysis  of  the  third-party 
application.  While  this  approach  increased  user  awareness  on 
resource  usage,  the  recommendations  are  static:  they  do  not 
consider  each  application’s  access  to  sensitive  data  on  a  case- 
by-case  basis.  Such  a  coarse-grained  approach  goes  against 
previous  work  suggesting  that  people  do  want  to  vary  their 
decisions  based  on  contextual  circumstances  [43].  A  blanket 
approval  or  denial  of  a  permission  to  a  given  application  car¬ 
ries  a  considerable  risk  of  privacy  violations  or  loss  of  desired 
functionality.  In  contrast,  our  work  uses  dynamic  analysis  to 
infer  the  appropriateness  of  each  given  request  by  considering 
the  surrounding  contextual  cues  and  how  the  user  has  behaved 
in  similar  situations  in  the  past.  As  with  Liu  et  al.,  their  dataset 
was  also  collected  from  privacy-conscious  and  considerably 
tech-savvy  individuals,  which  may  limit  the  generalization  of 
their  results.  The  field  study  we  conduct  in  our  work  uses  a 
more  representative  sample. 

Nissenbaum’s  theory  of  contextual  integrity  suggests  that 
permission  models  should  focus  on  information  flows  that  are 
likely  to  defy  user  expectations  [34].  There  are  three  main 
components  involved  in  deciding  the  appropriateness  of  a  flow 
[6]:  the  context  in  which  the  resource  request  is  made,  the  role 
played  by  the  requesting  application  under  the  current  context, 
and  the  type  of  resource  being  accessed.  Neither  previous  nor 
currently  deployed  permission  models  take  all  three  factors  into 
account.  This  model  could  be  used  to  improve  permission 
models  by  automatically  granting  access  to  data  when  the 
system  determines  that  it  is  appropriate,  denying  access  when 
it  is  inappropriate,  and  prompting  the  user  only  when  a  decision 
cannot  be  made  automatically,  thereby  reducing  user  burden. 

Access  Control  Gadgets  (ACGs)  were  proposed  as  a  mech¬ 
anism  to  tie  sensitive  resource  access  to  certain  UI  elements 
[32],  [35]-[37].  Authors  posit  that  such  an  approach  will 
increase  user  expectations,  as  a  significant  portion  of  partici¬ 
pants  expected  a  UI  interaction  before  a  sensitive  resource  us¬ 
age,  giving  users  an  implicit  mechanism  to  control  access  and 
increasing  awareness  on  resource  usage.  The  biggest  caveat  in 
this  approach  is  that  tying  a  UI  interaction  to  each  sensitive 
resource  access  is  impossible  in  practice  because  resources  are 
accessed  at  a  high  frequency  [43],  and  because  many  legitimate 
resource  accesses  occur  without  user  initiation  [15]. 
Wijesekera  et  al.  performed  a  field  study  [43]  to  operationalize 
the  notion  of  “context,”  to  allow  an  operating  system  to  dif¬ 
ferentiate  between  appropriate  and  inappropriate  data  requests 
by  a  single  application  for  a  single  data  type.  They  found  that 
users’  decisions  to  allow  a  permission  request  significantly 
correlated  with  that  application’s  visibility.  They  posit  that  this 
visibility  is  a  strong  contextual  cue  that  influences  users’ 


Permission  Type 

Activity 

ACCESS_WIFI_STATE 

View  nearby  SSIDs 

NEC 

Communicate  via  NFC 

READ_HISTORY_BOOKMARKS 

Read  users’  browser  history 

ACCESS_EINE_LOCATION 

Read  GPS  location 

ACCESS_COARSE_LOCATION 

Read  network-inferred  location 
(i.e.,  cell  tower  and/or  WiFi) 

LOCATION_HARDWARE 

Directly  access  GPS  data 

READ_CALL_LOG 

Read  call  history 

ADD_VOICEMAIL 

Read  call  history 

READ_SMS 

Read  sent/received/draft  SMS 

SEND_SMS 

Send  SMS 

*INTERNET 

Access  Internet  when  roaming 

*WRITE_SYNC_SETTINGS 

Change  application  sync 
settings  when  roaming 

TABLE  I 

Felt  et  al.  proposed  granting  a  select  set  of  12  permissions  at 

RUNTIME  SO  that  USERS  HAVE  CONTEXTUAL  INFORMATION  TO  INFER  WHY 
THE  DATA  MIGHT  BE  NEEDED[15].  OuR  INSTRUMENTATION  OMITS  THE  LAST 
TWO  PERMISSION  TYPES  (INTERNET  &  WRITE_SYNC_SETTINGS)  AND  RECORDS 
INFORMATION  ABOUT  THE  OTHER  10. 


responses  to  permission  prompts.  They  also  observed  that 
privacy  decisions  were  highly  nuanced,  demonstrating  that  a 
one-size-fits-all  model  is  unlikely  to  be  sufficient;  a  given 
information  flow  may  be  deemed  appropriate  by  one  user  but 
not  by  another  user.  They  recommended  applying  machine 
learning  in  order  to  infer  individual  users’  privacy  preferences. 
To  achieve  this,  research  is  needed  to  determine  what  factors 
affect  user  privacy  decisions  and  how  to  use  those  factors  to 
make  privacy  decisions  on  the  user’s  behalf.  While  we  cannot 
automatically  capture  everything  involved  in  Nissenbaum’s 
notion  of  context,  we  can  try  to  detect  when  context  has  likely 
changed  (insofar  as  to  decide  whether  a  different  privacy 
decision  should  be  made  for  the  same  application  and  data 
type),  by  seeing  whether  the  circumstances  surrounding  a  data 
request  are  similar  to  previous  requests. 

III.  Methodology 

We  collected  data  from  131  participants  to  understand  what 
factors  could  be  used  to  infer  whether  a  permission  request  is 
likely  to  be  deemed  appropriate  by  the  user. 

Previous  work  by  Felt  et  al.  made  the  argument  that  certain 
permissions  are  appropriate  for  runtime  prompts,  because  they 
protect  sensitive  resources  and  because  viewing  the  prompt  at 
runtime  imparts  additional  contextual  information  about  why 
an  application  might  need  the  permission  [15].  Similarly, 
Thompson  et  al.  showed  that  other  permission  requests  could 
be  replaced  with  audit  mechanisms,  because  they  represent 
either  reversible  changes  or  are  sufficiently  low  risk  to  not 
warrant  habituating  the  user  to  prompts  [41].  We  collected 
information  about  10  of  the  12  permissions  Felt  et  al.  suggest 
are  best-suited  for  runtime  prompts.  We  omitted  INTERNET  and 
WRITE_SYNC_SETTINGS,  because  those  permissions  only 
warrant  runtime  prompts  if  the  user  is  roaming  and  we  did  not 
expect  any  participant  to  be  roaming  during  the  study  period, 
and  focused  on  the  remaining  10  permission  types  (Table  I). 
While  there  are  many  other  sensitive  permissions  beyond  this 
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Resource  Access  Notification 

Google  Play  services  has  accessed 
your  location. 

Given  a  choice,  would  you  have 
allowed  or  denied  this  access? 

DENY  ALLOW 
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November  2015 


Use '  *■' 

•  ;'i|  I 


CANCtL  OK 


Type 

Event  Recorded 

Behavioral 

Instrumentation 

Changing  developer  options 

Opening/Closing  security  settings 

Changing  security  settings 

Enabling/Disabling  NFC 

Changing  location  mode 

Opening/Closing  location  settings 

changing  screen-lock  type 

Use  of  two  factor  authentication 

Log  initial  settings  information 

User  locks  the  screen 

Screen  times  out 

App  locks  the  screen 

Audio  mode  changed 

Enabling/Disabling  speakerphone 

Connecting/Disconnecting  headphones 

Muting  the  phone 

Taking  an  audio  call 

Taking  a  picture  (front-  vs.  rear-facing) 

Visiting  an  HTTPS  link  in  Chrome 

Responding  to  a  notification 

Unlocking  the  phone 

Runtime 

Information 

An  application  changing  the  visibility 

Platform  switches  to  a  new  activity 

Permission 

Requests 

An  app  requests  a  sensitive  permission 

ESM  prompt  for  a  selected  permission 

Fig.  1 .  A  screenshot  of  an  ESM  prompt. 


TABLE  II 

Instrumented  events  that  form  our  feature  set 


set.  Felt  et  al.  concluded  that  the  others  are  best  handled  by 
other  mechanisms  (e.g.,  install-time  prompts,  ACGs,  etc.). 

We  used  the  Experience  Sampling  Method  (ESM)  to  collect 
ground  truth  data  about  users’  privacy  preferences  [20].  ESM 
involves  repeatedly  questioning  participants  in  situ  about  a 
recently  observed  event;  in  this  case,  we  probabilistically  asked 
them  about  an  application’s  recent  access  to  data  on  their 
phone,  and  whether  they  would  have  permitted  it  if  given  the 
choice.  We  treated  participants’  responses  to  these  ESM  probes 
as  our  main  dependent  variable  (Eigure  1). 

We  also  instrumented  participants’  smartphones  to  obtain 
data  about  their  privacy-related  behaviors  and  the  frequency 
with  which  applications  accessed  protected  resources.  The 
instrumentation  required  a  set  of  modifications  to  the  Android 
operating  system  and  flashing  a  custom  Android  version  onto 
participants’  devices.  To  facilitate  such  experiments,  the  Uni¬ 
versity  of  Buffalo  offers  non-affiliated  academic  researchers 
access  to  the  PhoneLab  panel  [33],  which  consists  of  more  than 
200  participants.  All  of  these  participants  had  LG  Nexus  5 
phones  running  Android  5.1.1  and  the  phones  were  periodi¬ 
cally  updated  over-the-air  (OTA)  with  custom  modifications  to 
the  Android  operating  system.  Participants  can  decide  when  to 
install  the  OTA  update,  which  marks  their  entry  into  new  ex¬ 
periments.  During  our  experiment  period,  differentparticipants 
installed  the  OTA  update  with  our  instrumentation  at  different 
times,  thus  we  have  neither  data  on  all  PhoneLab  participants 
nor  data  for  the  entire  period.  Our  OTA  update  was  available  to 
participants  for  a  period  of  six  weeks,  between  Eebruary  2016 
and  March  2016.  At  the  end  of  the  study  period,  we  emailed 
participants  a  link  to  an  exit  survey  to  collect  demographic 


information.  Our  study  received  institutional  review  board  (IRB) 
approval.' 

A.  Instrumentation 

The  goal  of  our  instrumentation  was  to  collect  as  much 
runtime  and  behavioral  data  as  could  be  observed  from  the  An¬ 
droid  platform,  with  minimal  performance  cost.  We  collected 
three  categories  of  data:  behavioral  information,  runtime  infor¬ 
mation,  and  user  decisions.  We  made  no  modifications  to  any 
third-party  application  code;  our  dynamic  analysis  techniques 
could  be  used  on  any  third-party  Android  application. 

Table  11  contains  the  complete  list  of  behavioral  and  runtime 
events  our  instrumentation  recorded.  The  behavioral  data  fell 
under  several  categories,  all  chosen  based  on  several  hypothe¬ 
ses  that  we  had  about  the  types  of  behaviors  that  might  cor¬ 
relate  with  privacy  preferences:  web-browsing  habits,  screen 
locking  behavior,  third-party  application  usage  behavior,  audio 
preferences,  call  habits,  camera  usage  patterns,  and  behavior 
related  to  security  settings.  Eor  example,  we  hypothesized  that 
someone  who  manually  locks  their  device  screen  are  more 
privacy-conscious  than  someone  who  lets  it  time  out. 

We  also  collected  runtime  information  about  the  context  of 
each  permission  request,  including  the  visibility  of  the  request¬ 
ing  application  at  the  time  of  request,  what  the  user  was  doing 
when  the  request  was  made  (i.e.,  the  name  of  the  foreground 
application),  and  the  exact  Android  API  function  invoked  by 
the  application  to  determine  what  information  was  requested. 
The  visibility  of  an  application  reflects  the  extent  to  which  the 

'Approved  by  the  UC  Berkeley  IRB  under  protocol  #2013-02-4992 
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user  was  likely  aware  that  the  application  was  running;  if  the 
application  was  in  the  foreground,  the  user  had  cues  that  the 
application  was  running,  but  if  it  was  in  the  background,  then 
the  user  was  likely  not  aware  that  the  application  was  running 
and  therefore  might  find  the  permission  request  unexpected — 
some  background  services  can  still  be  visible  to  the  user  due  to 
on-screen  notification  or  other  cues  that  could  be  perceptible. 
We  monitored  processes’  memory  priority  levels  to  determine 
the  visibility  of  all  Android  processes.  We  also  collected 
information  about  which  Android  Activity  was  active  in  the 
application.^ 

Once  per  day  we  probabilistically  selected  one  of  these 
permission  requests  and  prompted  the  user  about  them  at 
runtime  (Figure  1).  We  used  weighted  reservoir  sampling  to 
select  a  permission  request  to  prompt  about.  We  weight  the 
combination  of  application,  permission,  visibility  based  on 
their  frequency  of  occurrence  seen  by  the  instrumentation;  the 
most-frequent  combination  has  a  higher  probability  of  being 
shown  to  participants  using  ESM.  We  prompted  participants  a 
maximum  of  three  times  for  each  unique  combination.  We 
tuned  the  wording  of  the  prompt  to  make  it  clear  that  the  request 
had  just  occurred  and  their  response  would  not  affect  the  system 
(a  deny  response  would  not  actually  deny  data).  These 
responses  serve  as  the  ground  truth  for  all  the  analysis 
mentioned  in  the  remainder  of  the  paper. 

The  intuition  behind  using  weighted  reservoir  sampling  is  to 
focus  more  on  the  frequently  occurring  permission  requests 
over  rare  ones.  Common  permission  requests  contribute  most 
to  user  habituation  due  to  their  high  frequency.  Thus,  it  is  more 
important  to  learn  about  user  privacy  decisions  on  highly 
frequent  permission  requests  over  the  rare  ones,  which  might 
not  risk  user  habituation  or  annoyance  (and  the  context  of  rare 
requests  may  be  less  likely  to  change). 

B.  Exit  Survey 

At  the  end  of  our  data  collection  period,  PhoneLab  staff 
emailed  participants  a  link  to  our  online  exit  survey,  which  they 
were  incentivized  to  complete  with  a  raffle  for  two  $100 
Amazon  gift  cards.  The  survey  gathered  demographic  informa¬ 
tion  and  qualitative  information  on  their  privacy  preferences. 
Of  the  203  participants  in  our  experiment,  53  fully  completed 
the  survey,  and  another  14  partially  completed  it.  Of  the  53 
participants  to  fully  complete  the  survey,  21  were  male,  31  were 
female,  and  1  undisclosed.  Participants  ranged  from  20  to  72 
years  of  age  {jj.  =  40.83,  CJ  =  14.32).  Participants  identified 
themselves  as  39.3%  staff,  32.1%  students,  19.6%  faculty,  and 
9%  other.  Only  2 1  %  of  the  survey  respondents  had  an  academic 
qualification  in  STEM,  which  suggests  that  the  sample  is 
unlikely  to  be  biased  towards  tech-savvy  users. 

C.  Summary 

We  collected  data  from  Eebruary  5  to  March  17,  2016. 
PhoneLab  allows  any  participant  to  opt-out  of  an  experiment  at 
any  time.  Thus,  of  the  203  participants  who  installed  our 

^An  Android  Activity  represents  the  application  screen  and  UI  elements 
currently  exposed  to  the  user. 


custom  Android  build,  there  were  131  who  used  it  for  more  than 
20  days.  During  the  study  period,  we  collected  176M  events 
across  all  participants  (3 IK  events  per  participant/day).  Our 
dataset  consists  of  1,686  unique  applications  and  13K  unique 
activities.  Participants  also  responded  to  4,636  prompts  during 
the  study  period.  We  logged  96M  sensitive  permission  requests, 
which  translates  to  roughly  one  sensitive  permission  request 
every  6  seconds  per  participant.  Eor  the  remainder  of  the  paper, 
we  only  consider  the  data  from  the  131  participants  who  used 
the  system  for  at  least  20  days,  which  corresponds  to  4,224 
ESM  prompts. 

Of  the  4,224  prompts,  55.3%  were  in  response  to  AC- 
CESS_WIFI_STATE,  when  trying  to  access  WiFi  SSID  informa¬ 
tion  that  could  be  used  to  infer  the  location  of  the  smartphone; 
21.0%,  17.3%,  5.08%,  0.78%,  and  0.54%  were  from  accessing 
location  directly,  reading  SMS,  sending  SMS,  reading  call 
logs,  and  accessing  browser  history,  respectively.  A  total  of 
137  unique  applications  triggered  prompts  during  the  study 
period.  Of  the  4,224  prompts,  participants  wanted  to  deny 
60.0 1  %  of  them,  and  57.65  %  of  the  prompts  were  shown  when 
the  requesting  application  was  running  in  the  foreground  or 
the  user  had  visual  cues  that  the  application  was  running  (e.g., 
notifications).  A  Wilcoxon  signed  rank  test  with  continuity 
correctionrevealed  a  statistically  significant  difference  in  par¬ 
ticipants’  desire  to  allow  or  deny  a  permission  request  based 
on  the  visibility  of  the  requesting  application  (p  <  0.01  52, 
r=  0.221 ),  which  corroborates  previous  findings  [43]. 

IV.  Types  ofUsers 

We  hypothesized  that  there  may  be  different  types  of  users 
based  on  how  they  want  to  disclose  their  private  information  to 
third  parties.  It  is  imperative  to  identify  these  different  sub¬ 
populations  since  different  permission  models  affect  users 
differently  based  on  their  privacy  preferences;  performance 
numbers  averaged  across  a  user  population  could  be  mislead¬ 
ing  since  different  sub-populations  might  react  differently  to 
the  same  permission  model. 

While  our  study  size  was  too  small  to  effectively  apply 
clustering  techniques  to  generate  classes  of  users,  we  did  find  a 
meaningful  distinction  using  the  denial  rate  (i.e.,  the  percentage 
of  prompts  to  which  users  wanted  to  deny  access).  We 
aggregated  users  by  their  denial  rate  in  10%  increments  and 
examined  how  these  different  participants  considered  the 
surrounding  contextual  circumstances  in  their  decisions. 

We  discovered  that  application  visibility  was  a  significant 
factor  for  users  with  a  denial  rate  of  10-90%,  but  not  for  users 
with  a  denial  rate  of  0-10%  or  90-100%.  We  call  the  former 
group  Contextual,  as  they  seem  to  care  about  the  surrounding 
context  (i.e.,  they  make  nuanced  decisions,  allowing  or  denying 
a  permission  request  based  on  whether  they  had  contextual  cues 
that  indicated  that  the  requesting  application  was  running),  and 
the  latter  group  Defaulters,  because  they  seem  to  simply  always 
allow  or  always  deny  requests,  regardless  of  contextual  cues. 

Defaulters  accounted  for  53%  of  131  participants  and  Con- 
textuals  accounted  for  47%.  A  Wilcoxon  signed-rank  test  with 
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to  this  prompt  is  thereafter  applied  whenever  the  same  ap¬ 
plication  requests  the  same  permission.  As  of  March  2017,  only 
34.1%  of  Android  users  have  Android  Marshmallow  or  a  higher 
version  [10],  and  among  these  Marshmallow  users,  those  who 
upgraded  from  a  previous  version  only  see  runtime  permission 
prompts  for  freshly-installed  applications. 

For  the  remaining  65.9%  of  users,  the  system  policy  is  ask- 
on-install  (AOI),  which  automatically  allows  all  runtime 
permission  requests.  During  the  study  period,  all  of  our  partic¬ 
ipants  had  AOI  running  as  the  default  permission  model.  Be¬ 
cause  all  runtime  permission  requests  are  allowed  in  AOI,  any 
of  our  ESM  prompts  that  the  user  wanted  to  deny  correspond  to 
mispredictions  under  the  AOI  model  (i.e.,  the  AOI  model 
granted  access  to  the  data  against  users’  actual  preferences). 


Fig.  2.  Histogram  of  users  based  on  their  denial  rate.  Defaulters  tended  to 
allow  or  deny  almost  all  requests  without  regard  for  contextual  cues,  whereas 
Contextuals  considered  the  visibility  of  the  requesting  application. 


Policy 

Contextuals 

Defaulters 

Overall 

Prompts 

AOI 

44.11% 

6.00% 

25.00% 

0.00 

AOFU-AP 

64.49% 

93.33% 

84.61% 

12.34 

AOFU-APV 

64.28% 

92.85% 

83.33% 

15.79 

AOFU-AfPV 

66.67% 

98.95% 

84.61% 

16.91 

AOFU-VP 

58.65% 

94.44% 

78.04% 

6.43 

AOFU-VA 

63.39% 

93.75% 

84.21% 

12.24 

AOFU-A 

64.27% 

93.54% 

83.33% 

9.06 

AOFU-P 

57.95% 

95.45% 

82.14% 

3.84 

AOFU-V 

52.27% 

95.34% 

81.48% 

2.00 

TABLE  III 

The  accuracy  and  number  of  dieferent  possible  ask-on-first-  use 

COMBINATIONS.  A:  APPLICATION  REQUESTING  THE  PERMISSION,  P: 

Permission  type  requested,  V :  Visibility  oe  the  application  requesting 
THE  permission,  Ap :  Application  running  in  the  eoreground  when  the 
request  is  made.  AOFU-AP  is  the  policy  used  in  Android 
Marshmallow  i.e.,  asking  (prompting)  the  user  eor  each  unique 
application,  permission  combination.  The  table  also  difeerentiates 

POLICY  NUMBERS  BASED  ON  THE  SUBPOPULATION 
OE  Contextuals,  Defaulters,  AND  ACROSS  ALL  users. 


continuity  correctionrevealed  a  statistically  significant  differ¬ 
ence  in  Contextuals  ’  responses  based  on  requesting  application 
visibility  (p  <  0.01 3,  r  =  0.31 2),  while  for  Defaulters  there 
was  no  statistically  significant  difference  (p=  0.227).  That  is, 
Contextuals  used  visibility  as  a  contextual  cue,  when  deciding 
the  appropriateness  of  a  given  permission  request,  whereas 
Defaulters  did  not  vary  their  decisions  based  on  this  cue. 
Figure  2  shows  the  distribution  of  users  based  on  their  denial 
rate.  Vertical  lines  indicate  the  borders  between  Contextuals 
and  Defaulters. 

In  the  remainder  of  the  paper,  we  use  our  Contextuals- 
Defaulters  categorization  to  measure  how  current  and  pro¬ 
posed  models  affect  these  two  sub-populations,  issues  unique 
to  these  sub-populations,  and  ways  to  address  these  issues. 

V.  Ask-On-First-Use  Permissions 

Ask-on-first-use  (AOFU)  is  the  current  Android  permission 
model,  which  was  first  adopted  in  Android  6.0  (Marshmallow). 
AOFU  prompts  the  user  whenever  an  application  requests  a 
dangerous  permission  for  the  first  time  [9];  the  user’s  response 


Table  III  shows  the  expected  median  accuracy  for  AOI,  as  well 
as  several  other  possible  variants  that  we  discuss  in  this  section. 
The  low  median  accuracy  for  Defaulters  was  due  to  the 
significant  number  of  people  who  simply  denied  most  of  the 
prompts.  The  prompt  count  is  zero  for  AOI  because  it  does  not 
prompt  the  user  during  runtime;  users  are  only  shown 
permission  prompts  at  installation. 

More  users  will  have  AOFU  in  the  future,  as  they  upgrade 
to  Android  6.0  and  beyond.  To  the  best  of  our  knowledge, 
no  prior  work  has  looked  into  quantifying  the  effectiveness  of 
AOFU  systematically;  this  section  presents  analysis  of  AOFU 
based  on  prompt  responses  collected  from  participants  and  cre¬ 
ates  a  baseline  against  which  to  measure  our  system’s  improve¬ 
ment.  We  simulate  how  AOFU  performs  through  our  ESM 
prompt  responses.  Because  AOFU  is  deterministic,  each  user’s 
response  to  the  first  prompt  for  each  application:permission 
combination  tells  us  how  the  AOFU  model  would  respond  for 
subsequent  requests  by  that  same  combination.  For  participants 
who  responded  to  more  than  one  prompt  for  each  combination, 
we  can  quantify  how  often  AOFU  would  have  been  correct  for 
subsequent  requests.  Similarly,  we  also  measure  the  accuracy 
for  other  possible  policies  that  the  platform  could  use  to  decide 
whether  to  prompt  the  user.  For  example,  the  status  quo  is 
for  the  platform  to  prompt  the  user  for  each  new  applica¬ 
tion:  permission  combination,  but  how  would  accuracy(and  the 
number  of  prompts  shown)  change  if  the  policy  were  to  prompt 
on  all  new  combinations  of  application:permission:visibilityl 
Table  III  shows  the  expected  median  accuracy^  for  each  policy 
based  on  participants’  responses.  For  each  policy,  A  represents 
the  application  requesting  the  permission,  P  rep-  resents  the 
requested  permission,  V  represents  the  visibility  of  the 
requesting  application,  and  Ap  represents  the  application 
running  in  the  foreground  when  a  sensitive  permission 
request  was  made.  For  instance,  AOFU-AP  is  the  policy  where 
the  user  will  be  prompted  for  each  new  instance  of  an 
application.-permission  combination,  which  the  Android 
6.0  model  employs.  The  last  column  shows  the  number  of 
runtime  prompts  a  participant  would  see  under  each  policy  over 
the  duration  of  the  study,  if  that  policy  were  to  be 

^The  presented  numbers — except  for  average  prompt  count,  which  was  nor¬ 
mally  distributed — are  median  values,  because  the  distributions  were  skewed. 
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implemented.  Both  AOFU-AP  and  AOFU-ApPV  show  about 
a  4.9-reduction  in  error  rate  compared  to  AOI;  AOFU-ApPV 
would  require  more  prompts  over  AOFU-AP,  though  yields  a 
similar  overall  accuracy  rate.  Moving  forward,  we  focus  our 
analysis  only  on  AOFU-AP  (i.e.,  the  current  standard). 

Instances  where  the  user  wants  to  deny  a  permission  and  the 
policy  instead  allows  it  (false  positives)  are  privacy  violations, 
because  they  expose  more  information  to  the  application  than 
the  user  desires.  Instances  where  the  user  wants  to  allow  a 
permission,  but  the  policy  denies  it  (false  negatives)  are 
functionality  losses.  This  is  because  the  application  is  likely 
to  lose  some  functionality  that  the  user  desired  when  it  is 
incorrectly  denied  a  permission.  Privacy  violations  and  func¬ 
tionality  losses  were  approximately  evenly  split  between  the 
two  categories  for  AOFU-AP:  median  privacy  violations  and 
median  functionality  losses  were  6.6%  and  5.0%,  respectively. 
The  AOFU  policy  works  well  for  Defaulters  because,  by 
definition,  they  tend  to  be  consistent  after  their  initial  responses 
for  each  combination.  In  contrast,  the  decisions  of  Contextual 
vary  due  to  other  factors  beyond  just  therequesting  application 
and  the  requested  permission  type.  Hence,  the  accuracy  of 
AOFU  for  Contextual  is  significantly  lower  than  the  accuracy 
for  Defaulters.  This  distinction  shows  that  learning  privacy 
preferences  for  a  significant  portion  of  users  requires  a  deeper 
understanding  of  factors  affecting  their  decisions,  such  as 
behavioral  tendencies  and  contextual  cues.  As  Table  III  sug¬ 
gests,  superficially  adding  more  contextual  variables  (such  as 
visibility  of  the  requesting  application)  does  not  necessarily 
help  to  increase  the  accuracy  of  the  AOFU  policy. 

The  context  in  which  users  are  prompted  under  AOFU  might 
be  a  factor  affecting  its  ability  to  predict  subsequent  instances. 
In  previous  work  [43],  we  found  that  the  visibility  of  the 
requesting  application  is  a  strong  contextual  cue  users  use  to 
vary  their  decisions.  During  the  study  period,  under  the  AOFU- 
AP  policy,  60%  of  the  prompts  could  have  occurred  when  the 
requesting  application  was  visible  to  the  participant — these 
prompts  had  an  accuracy  of  83.3%  in  predicting  subsequent 
instances.  In  instances  where  participants  were  prompted  when 
the  requesting  application  was  running  invisibly  to  the  user, 
AOFU-AP  had  an  accuracy  of  93.7%  in  predicting  subsequent 
instances.  A  Wilcoxon  signed-ranks  test,  however,  did  not 
reveal  a  statistically  significant  difference  (p<  0.3735). 

Our  estimated  accuracy  numbers  for  AOFU  may  be  inflated 
because  AOFU  in  deployment  (Android  6  and  above)  does  not 
filter  permission  requests  that  do  not  reveal  any  sensitive 
information.  For  example,  an  application  can  request  the 
ACCESS_FINE_L0CATI0N  permission  to  check  whether  the 
phone  has  a  specific  location  provider,  which  does  not  leak 
sensitive  information.  Our  AOFU  simulation  uses  the  invoked 
function  to  determine  if  sensitive  data  was  actually  accessed, 
and  only  prompts  in  those  cases  (in  the  interest  of  avoiding  any 
false  positives),  a  distinction  that  AOFU  in  Android  does  not 
make.  Thus,  an  Android  user  would  see  a  permission  request 


prompt  when  the  application  examines  the  list  of  location 
providers,  and  if  the  permission  is  granted,  would  not  subse¬ 
quently  see  prompts  when  location  data  is  actually  captured. 
Previous  work  found  that  79%  of  first-time  permission  requests 
do  not  reveal  any  sensitive  information  [43],  and  nearly  33.9% 
of  applications  that  request  these  sensitive  permission  types  do 
not  access  sensitive  data  at  all.  The  majority  of  AOFU prompts 
in  Marshmallow  are  therefore  effectively  false  positives,  which 
incorrectly  serve  as  the  basis  for  future  decisions.  Given  this, 
AOFU’s  average  accuracy  is  likely  less  than  the  numbers 
presented  in  Table  III.  We  therefore  consider  our  estimates  of 
AOFU  to  be  an  upper  bound. 

VI.  Learning  Privacy  Preferences 

Table  III  shows  that  a  significant  portion  of  users  (the  47% 
classified  as  Contextual)  make  privacy  decisions  that  depend 
on  factors  other  than  the  application  requesting  the  permission, 
the  permission  requested,  and  the  visibility  of  the  requesting 
application.  To  make  decisions  on  behalf  of  the  user,  we  must 
understand  what  other  factors  affect  their  privacy  decisions.  We 
built  a  machine  learning  model  trained  and  tested  on  our  labeled 
dataset  of  4,224  prompts  collected  from  131  users  over  the 
period  of  42  days.  This  approach  is  equivalent  to  training  a 
model  based  on  runtime  prompts  from  hundreds  of  users  and 
using  it  to  predict  those  users’  future  decisions. 

We  focus  the  scope  of  this  work  by  making  the  following  as¬ 
sumptions.  We  assume  that  the  platform,  i.e.,  the  Android  OS, 
is  trusted  to  manage  and  enforce  permissions  for  applications. 
We  assume  that  applications  must  go  through  the  platform’s 
permission  system  to  gain  access  to  protected  resources.  We 
assume  that  we  are  in  a  non-adversarial  machine-learning 
setting  wherein  the  adversary  does  not  attempt  to  circumvent 
the  machine-learned  classifier  by  exploiting  knowledge  of  its 
decision-making  process — though  we  do  present  a  discussion 
of  this  problem  and  potential  solutions  in  Section  IX. 

A.  Feature  Selection 

Using  the  behavioral,  contextual,  and  aggregate  features 
shown  in  Table  II,  we  constructed  16K  candidate  features, 
formed  by  combinations  of  specific  applications  and  actions. 
We  then  selected  20  features  by  measuring  Gini  importance 
through  random  forests  [30],  significance  testing  for  corre¬ 
lations,  and  singular  value  decomposition  (SVD).  SVD  was 
particularly  helpful  to  address  the  sparsity  and  high  dimension¬ 
ality  issues  caused  by  features  generated  based  on  application 
and  activity  usage.  Table  IV  lists  the  20  features  used  in  the  rest 
of  this  work. 

The  behavioral  features  {E)  that  proved  predictive  relate  to 
browsing  habits,  audio/call  traits,  and  locking  behavior.  All 
behavioral  features  were  normalized  per  day/user  and  were 
scaled  in  the  actual  model.  Features  relating  to  browsing  habits 
included  the  number  of  websites  visited,  the  proportion  of 
HTTPS-secured  links  visited,  the  number  of  downloads. 


greater  median  accuracy  when  examining  De- 
separately,  because  the  distributions  are  skewed,  the 


apd  proportion  of  sites  vjsited  that  requested  location  access. 
Feathres  relating  to  locking  behavior  Included  whether  users 


xtuals 

median  overall  accuracy  is  identical  to  AOFU-AP  when  combining  the  groups.  employed  a  passcode/PIN/pattern,  the  frequency  of  screen 
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Feature 

Group 

Feature 

Type 

Behavioral 

Features 

(B) 

Number  of  times  a  website  is  loaded  to 
the  Chrome  browser. 

Numerical 

Out  of  all  visited  websites,  the  proportion 
of  HTTPS-secured  websites. 

Numerical 

The  number  of  downloads  through  Chrome. 

Numerical 

Proportion  of  websites  requested  location 
through  Chrome. 

Numerical 

Number  of  times  PIN/Password  was  used  to 
unlock  the  screen. 

Numerical 

Amount  of  time  spent  unlocking  the  screen. 

Numerical 

Proportion  of  times  screen  was  timed  out 
instead  of  pressing  the  lock  button. 

Numerical 

Frequency  of  audio  calls. 

Numerical 

Amount  of  time  spent  on  audio  calls. 

Numerical 

Proportion  of  time  spent  on  silent  mode. 

Numerical 

Runtime 

Features 

(Rl) 

Application  visibility  (True/False) 

Categorical 

Permission  type 

Categorical 

User  ID 

Categorical 

Time  of  day  of  permission  request 

Numerical 

Aggregated 

Features 

(A) 

Average  denial  rate  for  (Al) 
application:permission:visibility 

Numerical 

Average  denial  rate  for  (A2) 
applicationFipermission:  visibility 

Numerical 

TABLE  IV 

The  complete  list  of  features  used  in  the  ML  model 

EVALUATION.  ALL  THE  NUMERICAL  VALUES  IN  THE  BEHAVIORAL  GROUP 
ARE  NORMALIZED  PER  DAY.  WE  USE  ONE-HOT  ENCODING  FOR 
CATEGORICAL  VARIABLES.  WE  NORMALIZED  NUMERICAL  VARIABLES  BY 
MAKING  EACH  ONE  A  Z-SCORE  RELATIVE  TO  ITS  OWN  AVERAGE. 


unlocking,  the  proportion  of  times  they  allowed  the  screen  to 
timeout  instead  of  pressing  the  lock  button,  and  the  average 
amount  of  time  spent  unlocking  the  screen.  Features  under  the 
audio  and  call  category  were  the  frequency  of  audio  calls,  the 
amount  of  time  they  spend  on  audio  calls,  and  the  proportion  of 
time  they  spent  on  silent  mode. 

Our  runtime  features  (R1/R2)  include  the  requesting  appli¬ 
cation’s  visibility,  permission  requested,  and  time  of  day  of  the 
request.  Initially,  we  included  the  user  ID  to  account  for  user- 
to-user  variance,  but  as  we  discuss  later,  we  subsequently 
removed  it.  Surprisingly,  the  application  requesting  the  per¬ 
mission  was  not  predictive,  nor  were  other  features  based  on 
the  requesting  application,  such  as  application  popularity. 

Different  users  may  have  different  ways  of  perceiving 
privacy  threats  posed  by  the  same  permission  request.  To 
account  for  this,  the  learning  algorithm  should  be  able  to 
determine  how  each  user  perceives  the  appropriateness  of  a 
given  request  in  order  to  accurately  predict  future  decisions.  To 
quantify  the  difference  between  users  in  how  they  perceive  the 
threat  posed  by  the  same  set  of  permission  requests,  we 
introduced  a  set  of  aggregate  features  that  could  be  measured 
at  runtime  and  that  may  partly  capture  users’  privacy 
preferences.  We  compute  the  average  denial  rate  for  each 
unique  combination  of  application:permission:visibility  (Al) 
and  of  applicationp  ^:permission:visibility  (A2).  These 
aggregate  features  indicate  how  the  user  responded  to  previous 
prompts  associated  with  that  combination.  As  expected,  after 

^The  application  running  in  the  foreground  when  the  permission  is  re¬ 
quested  by  another  application. 


Feature  Set 

Contextuals 

Defaulters 

Overall 

Rl 

69.30% 

95.80% 

83.71% 

R2  -l-B 

69.48% 

95.92% 

83.93% 

R2  -l-A 

75.45% 

99.20% 

92.24% 

TABLE  V 

The  median  accuracy  of  the  machine  learning  model  for 

DIFFERENT  FEATURE  GROUPS  ACROSS  DIFFERENT  SUB  POPULATIONS. 


we  introduced  the  aggregate  features,  the  relative  importance  of 
the  user  ID  variable  diminished  and  so  we  removed  it  (i.e.,  users 
no  longer  needed  to  be  uniquely  identified).  We  define  R2  as 
R1  without  the  user  ID. 

B.  Inference  Based  on  Behavior 

One  of  our  main  hypotheses  is  that  passively  observing 
users’  behaviors  helps  infer  users’  future  privacy  decisions.  To 
this  end,  we  instrumented  Android  to  collect  a  wide  array  of 
behavioral  data,  listed  in  Table  II.  We  categorize  our  behavioral 
instrumentation  into  interaction  with  Android  privacy/security 
settings,  locking  behavior,  audio  settings  and  call  habits,  web¬ 
browsing  habits,  and  application  usage  habits.  After  the  feature 
selection  process  (§VI-A),  we  found  that  only  locking 
behavior,  audio  habits,  and  web-browsing  habits  correlated 
with  privacy  behaviors.  Appendix  B  contains  more  information 
on  feature  importance.  All  the  numerical  values  under  the 
behavioral  group  were  normalized  per  day. 

We  trained  an  SVM  model  with  an  RBF  kernel  on  only  the 
behavioral  and  runtime  features  listed  in  Table  IV,  excluding 
user  ID.  The  5-fold  cross-validation  accuracy  (with  random 
splitting)  was  83%  across  all  users.  This  first  setup  assumes  we 
have  prior  knowledge  of  previous  privacy  decisions  to  a  certain 
extent  from  each  user  before  inferring  their  future  privacy 
decisions,  so  it  is  primarily  relevant  after  the  user  has  been 
using  their  phone  for  a  while.  However,  the  biggest  advantage 
of  using  behavioral  data  is  that  it  can  be  observed  passively 
without  any  active  user  involvement  (i.e.,  no  prompting). 

We  use  leave-one-out  cross  validation  to  measure  the  extent 
to  which  we  can  infer  user  privacy  decisions  with  absolutely  no 
user  involvement  (and  without  any  prior  data  on  a  user).  In  this 
second  setup,  when  a  new  user  starts  using  a  smartphone,  we 
assume  there  is  a  ML  model  which  is  already  trained  with 
behavioral  data  and  privacy  decisions  collected  from  a  selected 
set  of  other  users.  We  then  measured  the  efficacy  of  such  a 
model  to  predict  the  privacy  decisions  of  a  new  user,  purely 
based  on  passively  observed  behavior  and  runtime  information 
on  the  request,  without  ever  prompting  that  new  user.  This  is  an 
even  stricter  lower  bound  on  user  involvement,  which 
essentially  mandates  that  a  user  has  to  make  no  effort  to 
indicate  privacy  preferences,  something  that  no  system 
currently  does. 

We  performed  leave-one-out  cross  validation  for  each  of  our 
131  participants,  meaning  we  predicted  a  single  user’s  privacy 
decisions  using  a  model  trained  using  the  data  from  the  other 
130  users’  privacy  decisions  and  behavioral  data.  The  only 
input  for  each  test  user  was  the  passively  observed 
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behavioral  data  and  runtime  data  surrounding  each  request.  The 
model  yielded  a  median  accuracy  of  75%,  which  is  a  4 
improvement  over  AOI.  Furthermore,  AOI  requires  users  to 
make  active  decisions  during  the  installation  of  an  application, 
which  our  second  model  does  not  require. 

Examining  only  behavioral  data  with  leave-one-group-out 
cross  validation  yielded  a  median  accuracy  of  56%  for  Contex- 
tuals,  while  for  Defaulters  it  was  93.01%.  Although,  prediction 
using  solely  behavioral  data  fell  short  of  AOFU-AP  for  Con- 
textuals,  it  yielded  a  similar  median  accuracy  for  Defaulters', 
AOFU-AP  required  12  prompts  to  reach  this  level  of  accuracy, 
whereas  our  model  would  not  have  resulted  in  any  prompts. 
This  relative  success  presents  the  significant  observation  that 
behavioral  features,  observed  passively  without  user  involve¬ 
ment,  are  useful  in  learning  user  privacy  preferences.  This 
provides  the  potential  to  open  entirely  new  avenues  of  user 
learning  and  reduce  the  risk  of  habituation. 

C.  Inference  Based  on  Contextual  Cues 

Our  SVM  model  with  an  RBF  kernel  produced  the  best 
accuracy.  The  results  in  the  remainder  of  this  section  are  trained 
and  tested  with  five-fold  cross  validation  with  random  splitting 
for  an  SVM  model  with  an  RBF  kernel  using  the  ksvm  library 
in  R.  In  all  instances,  the  training  set  was  bootstrapped  with  an 
equal  number  of  allow  and  deny  data  points  to  avoid  training  a 
biased  model.  For  each  feature  group,  all  hyperparameters  were 
tuned  through  grid  search  to  achieve  highest  accuracy.  We  used 
one-hot  encoding  for  categorical  variables.  We  normalized 
numerical  variables  by  making  each  one  a  z-score  relative  to  its 
own  average.  Table  V  shows  how  the  median  accuracy  changes 
with  different  feature  groups.  As  a  minor  note,  the  addition  of 
the  mentioned  behavioral  features  to  runtime  features 
performed  only  marginally  better;  this  could  be  due  to  the  fact 
that  those  two  groups  do  not  complement  each  other  in 
predictions.  In  this  setup,  we  assume  that  there  is  a  single 
model  across  all  the  users  of  Android. 

By  incorporating  user  involvement  in  the  form  of  prompts, 
we  can  use  our  aggregate  features  to  increase  the  accuracy  for 
Contextual,  slightly  less  so  for  Defaulters.  The  aggregate 
features  primarily  capture  how  consistent  users  are  for  particu¬ 
lar  combinations  (i.e.,  application:permission:visibility,  appli- 
cationF:permission:visibility),  which  greatly  affects  accuracy 
for  Contextual.  Defaulters  have  high  accuracy  with  just  run¬ 
time  features  (Rl),  as  they  are  likely  to  stick  with  a  default 
allow  or  deny  policy  regardless  of  the  context  surrounding  a 
permission.  Thus,  even  without  any  aggregate  features  (which 
do  not  impart  any  new  information  about  this  type  of  user),  the 
model  can  predict  privacy  preferences  of  Defaulters  with  a  high 
degree  of  accuracy.  On  the  other  hand.  Contextual  are  more 
likely  to  vary  their  decision  for  a  given  permission  request. 
However,  as  the  accuracy  numbers  in  Table  V  suggest,  this 
variance  is  correlated  with  some  contextual  cues.  The  high 
predictive  power  of  aggregate  features  indicates  that  they  may 
be  capturing  the  contextual  cues,  used  by  Contextual  to  make 
decisions,  to  a  greater  extent. 


The  fact  that  both  application:permission:visibility  and 
applicationF.'permission.'visibility  are  highly  predictive  (Ap¬ 
pendix  A)  indicates  that  user  responses  for  these  combinations 
are  consistent.  The  high  consistency  could  relate  to  the  notion 
that  the  visibility  and  the  foreground  application  (applicationp 
are  strong  contextual  cues  people  use  to  make  their  privacy 
decisions;  the  only  previously  studied  contextual  cue  was  the 
visibility  of  the  application  requesting  the  sensitive  data  [43]. 
We  offer  a  hypothesis  for  why  foreground  application  could  be 
significant:  the  sensitivity  of  the  foreground  application  (i.e., 
high-sensitivity  applications  like  banking,  low-sensitivity 
applications  like  games)  might  impact  how  users  perceive 
threats  posed  by  requests.  Irrespective  of  the  application 
requesting  the  data,  users  may  be  likely  to  deny  the  request 
because  of  the  elevated  sense  of  risk.  We  discuss  this  further  in 
§IX. 

The  model  trained  on  feature  sets  R2,  Al,  and  A2  had  the  best 
accuracy  (and  the  fewest  privacy  violations).  For  the  remainder 
of  the  paper,  we  will  refer  to  this  model  unless  otherwise  noted. 
We  now  compare  AOFU-AP  (the  status  quo  as  of  Android  6.0 
and  above,  presented  in  Table  III)  and  our  model  (Table  V). 
Across  all  users,  our  model  reduced  the  error  rate  from  15.38% 
to  7.76%,  nearly  a  two-fold  improvement. 

Mispredictions  (errors)  in  the  MF  model  were  split  between 
privacy  violations  and  functionality  losses  (54%  and  46%). 
Deciding  which  error  type  is  more  acceptable  is  subjective  and 
depends  on  factors  like  the  usability  issues  surrounding 
functionality  losses  and  gravity  of  privacy  violations.  However, 
the  (approximately)  even  split  between  the  two  error  types 
shows  that  the  MF  is  not  biased  towards  one  particular  deci¬ 
sion  (denying  vs.  allowing  a  request).  Furthermore,  the  area 
under  the  ROC  curve  (AUC),  a  metric  used  to  measure  the 
fairness  of  a  classifier,  is  also  significantly  better  in  the  MF 
model  (0.936  as  opposed  to  0.796  for  AOFU).  This  indicates 
that  the  MF  model  is  equally  good  at  predicting  when  to  both 
allow  and  deny  a  permission  request,  while  AOFU  tends  to  lean 
more  towards  one  decision.  In  particular,  with  the  AOFU 
policy,  users  would  experience  privacy  violations  for  10.01% 
of  decisions,  compared  to  just  4.2%  with  the  MF  model. 
Privacy  violations  are  likely  more  costly  to  the  user  than 
functionality  loss:  denied  data  can  always  be  granted  at  a  later 
time,  but  disclosed  data  cannot  be  taken  back. 

While  increasing  the  number  of  prompts  improves  classifier 
accuracy,  it  plateaus  after  reaching  its  maximum  accuracy,  at  a 
point  we  call  the  steady  state.  For  some  users,  the  classifier 
might  not  be  able  to  infer  their  privacy  preferences  effectively, 
regardless  of  the  number  of  prompts.  As  a  metric  to  measure 
the  effectiveness  of  the  MF  model,  we  measure  the  confidence 
of  the  model  in  the  decisions  it  makes,  based  on  prediction 
class  probabilities.^  In  cases  where  the  confidence  of  the  model 

®Even  when  the  requesting  application  is  running  visible  to  the  user,  the 
foreground  application  could  still  be  different  from  the  requesting  application 
since  the  only  visible  cue  of  the  requesting  application  could  be  a  notification 
in  the  notification  bar. 

^To  calculate  the  class  probabilities,  we  used  the  KSVM  library  in  R.  It 
employs  a  technique  proposed  by  Platt  et  al.  [25]  to  produce  a  numerical  value 
for  each  class’s  probability. 
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is  below  a  certain  threshold,  the  system  should  use  a  runtime 
prompt  to  ask  the  user  to  make  an  explicit  decision.  Thus,  we 
looked  into  the  prevalence  of  low-confidence  predictions 
among  the  current  predictions.  With  a  95%  confidence  inter¬ 
val,  on  average  across  five  folds,  low-confidence  predictions 
accounted  for  less  than  10%  of  all  predictions.  The  remaining 
high-confidence  predictions  (90%  of  all  predictions)  had  an 
average  accuracy  of  96.2%,  whereas  predictions  with  low 
confidence  were  only  predicted  with  an  average  accuracy  of 
72%.  §VII-B  goes  into  this  aspect  in  detail  and  estimates  the 
rate  at  which  users  will  see  prompts  in  steady  state. 

The  caveat  in  our  ML  model  is  that  AOFU-AP  only  resulted 
in  12  prompts  on  average  per  user  during  the  study,  while  our 
model  averaged  24.  The  increased  prompting  stems  from 
multiple  prompts  for  the  same  combination  of  appli- 
cation:permission:visibility,  whereas  in  AOFU,  prompts  are 
shown  only  once  for  each  application:permission  combination. 
During  the  study  period,  users  on  average  saw  2.28  prompts  per 
unique  combination.  While  multiple  prompts  per  combination 
help  the  ML  model  to  capture  user  preferences  under  different 
contextual  circumstances,  it  risks  habituation,  which  may 
eventually  reduce  the  reliability  of  the  user  responses. 

The  evaluation  setup  mentioned  in  the  current  section  does 
not  have  a  specific  strategy  to  select  the  training  set.  It 
randomly  splits  the  data  set  into  the  5  folds  and  picks  4  out  of  5 
as  the  training  set.  In  a  real-world  setup,  the  platform  needs  a 
strategy  to  carefully  select  the  training  set  so  that  the  platform 
can  learn  most  of  the  user’s  privacy  preferences  with  a 
minimum  number  of  prompts.  The  next  section  presents  an  in- 
depth  analysis  on  possible  ways  to  reduce  the  number  of 
prompts  needed  to  train  the  ML  model. 

VII.  Learning  Strategy 

This  sections  presents  a  strategy  the  platform  can  follow  in 
the  learning  phase  of  a  new  user.  The  key  objective  of  the 
learning  strategy  should  be  to  learn  the  user’s  privacy 
preferences  with  minimal  user  involvement  (prompts).  Once 
the  model  reaches  adequate  training,  we  can  use  model  deci¬ 
sion  confidence  to  analyze  how  the  ML  model  performs  for 
different  users  and  examine  the  tradeoff  between  user  involve¬ 
ment  and  accuracy.  We  also  utilize  the  model’s  confidence  on 
decisions  to  present  a  strategy  that  can  further  reduce  model 
error  through  selective  permission  prompting. 

A.  Bootstrapping 

The  bootstrapping  phase  occurs  when  the  ML  model  is 
presented  with  a  new  user  about  whom  the  model  has  no  prior 
information.  In  this  section,  we  analyze  how  the  accuracy 
improves  as  we  prompt  the  user.  Since  the  model  presented  in 
§VI  is  a  single  model  trained  with  data  from  all  users,  the  ML 
model  can  still  predict  a  new  user’s  privacy  decisions  by 
leveraging  the  data  collected  on  other  users’  preferences. 

We  measured  the  accuracy  of  the  ML  model  as  if  it  had  to 
predict  each  user’s  prompt  responses  using  a  model  trained 
using  other  users’  data.  Formally,  this  is  called  leave-one-out 
cross-validation,  where  we  remove  all  the  prompt  responses 


from  a  single  user.  The  training  set  contains  all  the  prompt  re¬ 
sponses  from  130  users  and  the  test  set  is  the  prompt  responses 
collected  from  the  single  remaining  user.  The  model  had  a 
median  accuracy  of  66.6%  (56.2%  for  Contextual,  86.4%  for 
Defaulters).  Although  this  approach  does  not  prompt  new 
users,  it  falls  short  of  AOFU.  This  no-prompt  model  behaves 
close  to  random  guessing  for  Contextual  and  significantly 
better  for  Defaulters.  Furthermore,  Wijesekera  et  al.  found  that 
individuals’  privacy  preferences  varied  a  lot  [43],  suggesting 
that  utilizing  other  users’  decisions  to  predict  decisions  for  a 
new  user  has  limited  effectiveness,  especially  for  Contextuals', 
some  level  of  prompting  is  necessary. 

There  are  a  few  interesting  avenues  to  explore  when 
determining  the  optimal  way  to  prompt  the  user  in  the  learning 
phase.  One  option  would  be  to  follow  the  same  weighted- 
reservoir  sampling  algorithm  mentioned  in  §III-A.  The 
algorithm  is  weighted  by  the  frequency  of  each  appli- 
cation:permission:visibility  combination.  The  most  frequent 
combination  will  have  the  highest  probability  of  creating  a 
permission  prompt  and  after  the  given  combination  reaches  a 
maximum  of  three  prompts,  the  algorithm  will  no  longer 
consider  that  combination  for  prompting,  giving  the  second 
most  frequent  combination  the  new  highest  probability.  Due  to 
frequency-weighting  and  multiple  prompts  per  combination, 
the  weighted-reservoir  sampling  approach  requires  more 
prompts  to  cover  a  broader  set  of  combinations.  However, 
AOFU  prompts  only  once  per  combination  without  frequency¬ 
weighting.  This  may  be  a  useful  strategy  initially  for  a  new  user 
since  it  allows  the  platform  to  learn  about  the  users’  privacy 
preferences  for  a  wide  array  of  combinations  with  minimal  user 
interaction. 

To  simulate  such  an  approach,  we  extend  the  aforemen¬ 
tioned  no-prompt  model  (leave-one-out  validation).  In  the  no¬ 
prompt  model,  there  was  no  overlap  of  users  in  the  train  and 
test  set.  In  the  new  approach,  the  training  set  includes  the  data 
from  other  users  as  well  as  the  new  user’s  responses  to  the  first 
occurrence  of  each  unique  combination  of  applica- 
tion:permission:visibility.  The  first  occurrence  of  each  unique 
combination  simulates  the  AOFU-APV  policy.  That  is,  this 
model  is  bootstrapped  using  data  from  other  users  and  then 
adopts  the  AOFU-APV  policy  to  further  learn  the  current  user’s 
preferences.  The  experiment  was  conducted  using  the  same  set 
of  features  mentioned  in  §VI-A  (R2  +  Al  +  A2  and  an  SVM 
with  a  RBF  kernel).  The  test  set  only  contained  prompt 
responses  collected  after  the  last  AOFU  prompt  to  ensure 
chronological  consistency. 

Figure  3  shows  how  accuracy  changes  with  the  varying 
number  of  AOFU  prompts  for  Contextual  and  Defaulters.  For 
each  of  the  131  users,  we  ran  the  experiment  varying  the  AOFU 
prompts  from  1  to  12.  We  chose  this  upper  bound  because,  on 
average,  a  participant  saw  12  different  unique  ap- 
plication.-permission  combinations  during  the  study  period — 
the  current  permission  model  in  Android.  AOFU  relies  on  user 
prompts  for  each  new  combination.  The  proposed  ML  model, 
however,  has  the  advantage  of  leveraging  data  collected  from 
other  users  to  predict  a  combination  not  seen  by  the  user;  it  can 
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find  a  way  to  differentiate  between  Defaulters  and  Contextuals 
early  in  the  bootstrapping  phase  to  determine  which  users 
^  ^  require  fewer  prompts.  The  analysis  of  our  hybrid  approach 

09  ^  c  ‘  ‘  addresses  the  concern  of  a  high  number  of  permission  prompts 

‘  ‘  initially  for  an  ML  approach.  Over  time,  accuracy  can  always 

Population  be  improved  with  more  prompts. 

I  '  .  ^  +  conteauai.  Qut  new  hybrid  approach  of  using  AOFU-style  permission 

<  X  Defaulters  pj-Qmpts  in  the  bootstrapping  phase  to  train  our  model  can 

...  °  achieve  higher  accuracy  than  AOFU,  with  significantly  fewer 

prompts.  Having  a  learning  strategy  (use  of  AOFU)  over  ran- 
•  dom  selection  helped  to  minimize  user  involvement  (24  vs.  9) 

while  significantly  reducing  the  error  rate  (7.6%  vs.  3.2%)  over 
.  a  random  selection  of  the  training  set. 

1  2  3  4  5  6  7  8  9  10  11  12 

Prompt  Count 


Fig.  3.  How  the  median  accuracy  varies  with  the  number  of  seen  prompts 

significantly  reduce  user  involvement  in  the  learning  phase. 
After  12  prompts,  accuracy  reached  96.8%  across  all  users. 

Each  new  user  starts  off  with  a  single  model  shared  by  all 
new  users  and  then  moves  onto  a  separate  model  trained  with 
AOFU  prompt  responses.  We  analyze  its  performance  for 
Defaulters  and  Contextuals  separately,  finding  that  it  improves 
accuracy  while  reducing  user  involvement  in  both  cases,  com¬ 
pared  to  the  status  quo. 

We  first  examine  how  our  model  performs  for  Defaulters, 
53%  of  our  sample.  Figure  3  shows  that  our  model  trained  with 
AOFU  permission-prompt  responses  outperforms  AOFU  from 
the  very  beginning.  The  model  starts  off  with  96.6%  accuracy 
(before  it  reaches  close  to  100%  after  6  prompts),  handily 
exceeding  AOFU’s  93.33%.  This  is  a  83.3%  reduction  in 
permission  prompts  compared  to  AOFU-AP  (the  status  quo). 
Even  with  such  a  significant  reduction  in  user  involvement,  the 
new  approach  cuts  the  prediction  error  rate  in  half. 

Contextuals  needed  more  prompts  to  outperform  the  AOFU 
policy;  the  hybrid  approach  matches  AOFU-AP  with  just  7 
prompts,  a  42%  reduction  in  prompts.  With  12  permission 
prompts,  same  as  needed  for  AOFU-AP,  the  new  approach  had 
reduced  the  error  rate  by  43%  over  AOFU-AP  (the  status  quo). 
The  number  of  prompts  needed  to  reach  this  level  of  accuracy 
in  the  new  approach  is  25%  less  than  what  is  needed  for  AOFU- 
APV.  We  also  observed  that  as  the  number  of  prompts 
increased,  the  AUC  of  our  predictions  also  similarly  increased. 
Overall,  the  proposed  learning  strategy  reduced  the  error  rate 
by  80%  after  12  user  prompts  over  AOFU-AP.  Given, 
Defaulters  plateau  early  in  their  learning  cycle  (after  only  6 
prompts),  the  proposed  learning  strategy,  on  average,  needs  9 
prompts  to  reach  its  maximum  capacity,  which  is  a  25% 
reduction  in  user  involvement  over  AOFU-AP. 

Contextuals  have  a  higher  need  for  user  involvement  than 
Defaulters,  primarily  because  it  is  easy  to  learn  about  De¬ 
faulters,  as  they  are  more  likely  to  be  consistent  with  early 
decisions.  On  the  other  hand,  Contextuals  vary  their  decisions 
based  on  different  contextual  cues  and  require  more  user 
involvement  for  the  model  to  learn  the  cues  used  by  each  user 
and  how  do  they  affect  their  decisions.  Thus,  it  is  important  to 


B.  Decision  Confidence 

In  the  previous  section,  we  looked  into  how  we  can  optimize 
the  learning  phase  by  merging  AOFU  and  the  ML  model  to 
reach  higher  accuracy  with  minimal  user  prompts.  However,  for 
a  small  set  of  users,  more  permission  prompts  will  not  increase 
accuracy,  regardless  of  user  involvement  in  the  boot-  strapping 
phase.  This  could  be  due  to  the  fact  that  a  portion  of  users  in 
our  dataset  are  making  random  decisions,  or  that  the  features 
that  our  ML  model  takes  into  account  are  not  predictive  of  those 
users’  decision  processes.  While  we  do  not  have  the  data  to 
support  either  explanation,  we  examine  how  we  can  measure 
whether  the  ML  model  will  perform  well  for  a  particular  user 
and  quantify  how  often  it  does  not.  We  present  a  method  to 
identify  difficult-to-predict  users  and  reduce  permission 
prompting  for  those  users. 

While  running  the  experiment  in  §VII-A,  we  also  measured 
how  confident  the  ML  model  was  for  each  decision  it  made.  To 
measure  the  ML  modefs  confidence,  we  record  the  probability 
for  each  decision;  since  it  is  a  binary  classification  (deny  or 
allow),  the  closer  the  probability  is  to  0.5,  the  less  confident  it 
is.  We  then  chose  a  class  probability  threshold  above  which  a 
decision  would  be  considered  a  high-confidence  decision.  In 
our  analysis,  we  choose  a  class  probability  threshold  of  0.6, 
since  this  value  resulted  in  >96%  accuracy  for  our  fully-  trained 
model  (25  prompts  per  user)  for  high-confidence  decisions,  but 
this  is  a  tunable  threshold.  Thus,  in  the  remainder  of  our 
analysis,  decisions  that  the  ML  model  made  with  a  probability 
of  >0.60  were  labeled  as  high-confidence  decisions,  while  those 
made  with  a  probability  of  <0.60  were  labeled  as  low- 
confidence  decisions. 

Since  the  most  accurate  version  of  AOFU  uses  12  prompts, 
we  also  evaluate  the  confidence  of  our  model  after  12  AOFU- 
style  prompts.  This  setup  is  identical  to  the  bootstrapping 
approach;  the  model  we  evaluate  here  is  trained  on  responses 
from  other  users  and  the  first  12  prompts  chosen  by  AOFU. 
With  this  scheme,  we  found  that  10  users  (7.63%  of  131  users) 
had  at  least  one  decision  predicted  with  low  confidence.  The 
remaining  92.37%  of  users  had  all  privacy  decisions  predicted 
with  high  confidence.  Among  those  users  whose  decisions 
were  predicted  with  low  confidence,  the  proportion  of  low- 
confidence  decisions  on  average  accounted  for  17.63%  (median 
=  16.67%)  out  of  all  their  predicted  decisions.  With 
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a  sensitive  permission  request  once  every  15  seconds  [43], 
prompting  even  for  17.63%  of  predictions  is  not  practical. 
Users  who  had  low-confidence  predictions  had  a  median  accu¬ 
racy  of  60.17%,  compared  to  98%  accuracy  for  the  remaining 
set  of  users  with  only  high-confidence  predictions.  Out  of  the 
10  users  who  had  low-confidence  predictions,  there  were  no 
Defaulters.  This  further  supports  the  observation  in  Figure  3 
that  Defaulters  require  a  shorter  learning  period. 

In  a  real-world  scenario,  after  the  platform  (ML  model) 
prompts  the  user  for  the  first  12  AOFU  prompts,  the  platform 
can  measure  the  confidence  of  predicting  unlabeled  data 
(sensitive  permission  requests  for  which  the  platform  did  not 
prompt  the  user).  If  the  proportion  of  low-confidence 
predictions  is  below  some  threshold,  the  ML  model  can  be 
deemed  to  have  successfully  learned  user  privacy  preferences 
and  the  platform  should  keep  on  using  the  regular  permission¬ 
prompting  strategy.  Otherwise,  the  platform  may  choose  to 
limit  prompts  (i.e.,  two  per  unique  applica- 
tion:permission:visibility  combination).  It  should  also  be  noted 
that  rather  than  having  a  fixed  number  of  prompts  (e.g.,  12)  to 
measure  the  low-confidence  proportion,  the  platform  can  keep 
track  of  the  low-confidence  proportion  as  it  prompts  the  user 
according  to  any  heuristic  (i.e.,  unique  combinations).  If  the 
proportion  does  not  decrease  with  the  number  of  prompts,  we 
can  infer  that  the  ML  model  is  not  learning  user  preferences 
effectively  or  the  user  is  making  random  decisions,  indicating 
that  limiting  prompts  and  accepting  lower  accuracy  could  be  a 
better  option  for  that  specific  user,  to  avoid  excessive 
prompting.  However,  depending  on  which  group  the  user  is  in 
(Contextual  or  Defaulter),  the  point  at  which  the  platform  could 
make  the  decision  to  continue  or  limit  prompting  could  change. 
In  general,  the  platform  should  be  able  to  reach  this  deciding 
point  relatively  quickly  for  Defaulters. 

Among  participants  with  no  low-confidence  predictions,  we 
had  a  median  error  rate  of  2%  (using  the  new  hybrid  approach 
after  12  AOFU  prompts);  for  the  same  set  of  users,  AOFU 
could  only  reach  a  median  error  rate  of  13.3%.  However,  using 
AOFU,  a  user  in  that  set  would  have  needed  an  average  of 
15.11  prompts  to  reach  that  accuracy.  Using  the  ML  model,  a 
user  would  need  just  9  prompts  on  average  (Defaulters  require 
far  fewer  prompts,  dropping  the  average);  the  model  only 
requires  60%  of  the  prompts  that  AOFU  requires.  Even  with  far 
fewer  prompts  in  the  learning  phase,  the  ML  model  achieves  a 
84.61%  reduction  in  error  rate  relative  to  AOFU. 

While  our  model  may  not  perform  well  for  all  users,  it  does 
seem  to  work  quite  well  for  the  majority  of  users  (92.37%  of 
our  sample).  We  provide  a  way  of  quickly  identifying  users  for 
whom  our  system  does  not  perform  well,  and  propose  limiting 
prompts  to  avoid  excessive  user  burden  for  those  users,  at  the 
cost  of  reduced  efficacy.  In  the  worst  case,  we  could  simply 
employ  the  AOFU  model  for  users  our  system  does  not  work 
well  for,  resulting  in  a  multifaceted  approach  that  is  at  least  as 
good  as  the  status  quo  for  all  users. 


C.  Online  Model 

Our  proposed  system  relies  on  training  models  on  a  trusted 
server,  sending  it  to  client  phones  (i.e.,  as  a  weight  vector),  and 
having  phones  make  classifications.  By  utilizing  an  online 
learning  model,  we  can  train  models  incrementally  as  users 
respond  to  prompts  over  time.  There  are  two  key  advantages  to 
this:  (i)  this  model  adapts  to  changing  user  preferences  over 
time;  (ii)  it  distributes  the  overhead  of  training  increasing  the 
practicality  of  locally  training  the  classifier  on  the  phone  itself. 
Our  scheme  requires  two  components:  a  feature  extraction  and 
storage  mechanism  on  the  phone  (a  small  extension  to  our 
existing  instrumentation)  and  a  machine  learning  pipeline  on  a 
trusted  server.  The  phone  sends  feature  vectors  to  the  server 
every  few  prompts,  and  the  server  responds  with  a  weight  vector 
representing  the  newly  trained  classifier.  To  bootstrap  the 
process,  the  server’s  models  can  be  initialized  with  a  model 
trained  on  a  few  hundred  users,  such  as  our  single  model  across 
all  users.  Since  each  user  contributes  data  points  over  time,  the 
online  model  adapts  to  changing  privacy  preferences  even  if 
they  conflict  with  previous  data.  When  using  this  scheme,  each 
model  takes  less  than  10  KB  to  store.  With  our  current  model, 
each  feature  and  weight  vector  are  at  most  3  KB  each,  resulting 
in  at  most  6  KB  of  data  transfer  per  day. 

To  evaluate  the  accuracy  of  our  online  model,  we  trained  a 
classifier  using  stochastic  gradient  descent  (SGD)  with  five¬ 
fold  cross  validation  on  our  4,224-point  data  set.  This  served  as 
the  bootstrapping  phase.  We  then  simulated  receiving  the 
remaining  data  one-at-a-time  in  timestamp  order.  Any  features 
that  changed  with  time  (e.g.,  running  averages  for  aggregate 
features,  event  counts)  were  computed  with  each  incoming  data 
point,  creating  a  snapshot  of  features  as  the  phone  would  see  it. 
We  then  tested  accuracy  on  the  chronologically  last  20%  of  our 
dataset.  Our  SGD  classifier  had  93.8%  accuracy  (AUC=0.929). 
We  attribute  the  drop  in  accuracy  (compared  to  our  offline 
model)  to  the  fact  that  running  averages  take  multiple  data 
points  to  reach  steady-state,  causing  some  earlier  predictions  to 
be  incorrect. 

A  natural  concern  with  a  trusted  server  is  compromise.  To 
address  this  concern,  we  do  not  send  any  personally- 
identifiable  data  to  the  server,  and  any  features  sent  to  the  server 
are  scaled',  they  are  reported  in  standard  deviations  from  the 
mean,  not  in  raw  values.  Furthermore,  using  an  online  model 
with  incremental  training  allows  us  to  periodically  train  the 
model  on  the  phone  (i.e.,  nightly,  when  the  user  is  charging  her 
device)  to  eliminate  the  need  for  a  trusted  server. 

VIII.  Contextual  Integrity 

Contextual  integrity  is  a  conceptual  framework  that  helps 
explain  why  most  permission  models  fail  to  protect  user 
privacy — they  often  do  not  take  the  context  surrounding  pri¬ 
vacy  decisions  into  account.  In  addressing  this  issue,  we 
propose  an  ML  model  that  infers  when  context  has  changed. 
We  believe  that  this  is  an  important  first  step  towards  opera¬ 
tionalizing  the  notion  of  contextual  integrity.  In  this  section,  we 
explain  the  observations  that  we  made  in  §VI-C  based  on  the 
contextual  integrity  framework  proposed  by  Barth  et  al.  [6]. 
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Contextual  integrity  provides  a  conceptual  framework  to 
better  understand  how  users  make  privacy  decisions;  we  use 
Barth  et  al.’s  formalized  model  [6]  as  a  framework  in  which  to 
view  Android  permission  models.  Barth  et  al.  model  parties  as 
communicating  agents  (P)  knowing  information  represented 
as  attributes  (T).A  knowledge  state  k  is  defined  as  a  subset 
of  P  ^  P-^  T.  We  use  k  =  {p,  q,  f)  to  mean  that  agent  p  knows 
attribute  t  of  agent  q.  Agents  play  roles  (  R)  in  contexts  (  C). 
For  example,  an  agent  can  be  a  game  application,  and 
have  the  role  of  a  game  provider  in  an  entertainment  context. 
Knowledge  transfer  happens  when  informationis  communi¬ 
cated  between  agents;  all  communications  can  be  represented 
through  a  series  of  traces  (k,  (p,  f),  a),  which  are  combinations 
of  a  knowledge  states  a  role  state  (p,  r),  and  a  communi¬ 
cation  action  a  (information  sent).  The  role  an  agent  plays 
in  a  given  context  helps  determine  whether  an  information 
flow  is  acceptable  for  a  user.  The  relationship  between  the 
agent  sending  the  information  and  the  role  of  the  agent  ((p,  r)) 
receiving  the  information  must  follow  these  contextual  norms. 
With  the  Android  permission  model,  the  same  framework 
can  be  applied.  Both  the  user  and  the  third-party  applica¬ 
tion  are  communicating  agents,  and  the  information  to  be 
transferred  is  the  sensitive  data  requested  by  the  applica¬ 
tion.  When  a  third-party  application  requests  permission  to 
access  a  guarded  resource  (e.g.,  location),  knowledge  of  the 
guarded  resource  is  transferred  from  the  one  agent  (i.e.,  the 
user/platform)  to  another  agent  (i.e.,  the  third-party  applica¬ 
tion).  The  extent  to  which  a  user  expects  a  given  request 
depends  not  on  the  agent  (the  application  requesting  the  data), 
but  on  the  role  that  agent  is  playing  in  that  context.  This 
explains  why  the  application  as  a  feature  itself  (i.e.,  application 
name)  was  not  predictive  in  our  models:  this  feature  does  not 
represent  the  role  when  determining  whether  it  is  unexpected. 
While  it  is  difficult  for  the  platform  to  determine  the  exact  role 
an  application  is  playing,  the  visibility  of  the  application  hints 
at  its  role.  For  instance,  when  the  user  is  using  Google  Maps  to 
navigate,  it  is  playing  a  different  role  from  when  Google  Maps 
is  running  in  the  background  without  the  user’s  knowledge. 
We  believe  that  this  is  the  reason  why  the  visibility  of  the 
requesting  application  is  significant:  it  helps  the  user  to  infer 
the  role  played  by  the  application  requesting  the  permission. 
The  user  expects  applications  in  certain  roles  to  access 
resources  depending  on  the  context  in  which  the  request  is 
made.  We  believe  that  the  foreground  application  sets  this 
context.  Thus  a  combination  of  the  role  and  the  context 
decides  whether  an  information  flow  is  expected  to  occur  or 
not.  Automatically  inferring  the  exact  context  of  a  request  is 
likely  an  intractable  problem.  For  our  purposes,  however,  it  is 
possible  that  we  need  to  only  infer  when  context  has  changed, 
or  rather,  when  data  is  being  requested  in  a  context  that  is  no 
longer  acceptable  to  the  user.  Based  on  our  data,  we  believe 
that  features  based  on  foreground  application  and  visibility  are 
most  useful  for  this  purpose,  from  our  collected  dataset. 

We  now  combine  all  of  this  into  a  concrete  example  within 
the  contextual  integrity  framework:  If  a  user  is  using  Google 
Maps  to  reach  a  destination,  the  application  can  play  the 


role  of  a  navigator  in  a  geolocation  context,  whereby  the  user 
feels  comfortable  sharing  her  location.  In  contrast,  if  the  same 
application  requests  location  while  running  as  a  service 
invisible  to  the  user,  the  user  may  not  want  to  provide  the  same 
information.  Background  applications  play  the  role  of  “passive 
listeners”  in  most  contexts;  this  role  as  perceived  by  the  user 
may  be  why  background  applications  are  likelier  to  violate 
privacy  expectations  and  consequently  be  denied  by  users. 

AOFU  primarily  focuses  on  controlling  access  through  rules 
for  application:permission  combinations.  Thus,  AOFU 
neglects  the  role  played  by  the  application  (visibility)  and  relies 
purely  on  the  agent  (the  application)  and  the  information 
subject  (permission  type).  This  explains  why  AOFU  is  wrong 
in  nearly  one-fifth  of  cases.  Based  on  Table  III,  both  AOFU-  VA 
(possibly  identifying  the  role  played  by  the  application)  and 
AOFU-AfPV  (possibly  identifying  the  current  context  because 
of  the  current  foreground  application-Ap)  have  higher  accuracy 
than  the  other  AOFU  combinations.  However,  as  the  contextual 
integrity  framework  suggests,  the  permission  model  has  to  take 
both  the  role  and  the  current  context  into  account  before  making 
an  accurate  decision.  AOFU  (and  other  models  that  neglect 
context)  only  makes  it  possible  to  consider  a  single  aspect,  a 
limitation  that  does  not  apply  to  our  model. 

While  the  data  presented  in  this  work  suggest  the  impor¬ 
tance  of  capturing  context  to  better  protect  user  privacy,  more 
work  is  needed  along  these  lines  to  fully  understand  how  peo¬ 
ple  use  context  to  make  decisions  in  the  Android  permission 
model.  Nevertheless,  we  believe  we  contribute  a  significant 
initial  step  towards  applying  contextual  integrity  to  improve 
smartphone  privacy  by  dynamically  regulating  permissions. 

IX.  Discussion 

The  primary  goal  of  this  research  was  to  improve  the 
accuracy  of  the  Android  permission  system  so  that  it  more 
correctly  aligns  with  user  privacy  preferences.  We  began  with 
four  hypotheses:  (i)  that  the  currently  deployed  AOFU  policy 
frequently  violates  user  privacy;  (ii)  that  the  contextual  infor¬ 
mation  it  ignores  is  useful;  (iii)  that  a  ML-based  classifier  can 
account  for  this  contextual  information  and  thus  improve  on  the 
status  quo;  and  (iv)  that  passively  observable  behavioral  traits 
can  be  used  to  infer  privacy  preferences. 

To  test  these  hypotheses,  we  performed  the  first  large-scale 
study  on  the  effectiveness  of  AOFU  permission  systems  in  the 
wild,  which  showed  that  hypotheses  (i)  and  (ii)  hold.  We 
further  built  an  ML  classifier  that  took  user  permission 
decisions  along  with  observations  of  user  behaviors  and  the 
context  surrounding  those  decisions  to  show  that  (iii)  and  (iv) 
hold.  Our  results  show  that  existing  systems  have  significant 
room  for  improvement,  and  other  permission-granting  systems 
may  benefit  from  applying  our  results. 

A.  Limitations  of  Permission  Models 

Our  field  study  confirms  that  users  care  about  their  privacy 
and  are  wary  of  permission  requests  that  violate  their  expec¬ 
tations.  We  observed  that  95%  of  participants  chose  to  block  at 
least  one  permission  request;  in  fact,  the  average  denial  rate  was 
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60% — a  staggering  amount  given  that  the  AOI  model  permits 
all  permission  requests  for  an  installed  application. 

While  AOFU  improves  over  the  AOI  model,  it  still  violates 
user  privacy  around  one  in  seven  times,  as  users  deviate  from 
their  initial  responses  to  permission  requests.  This  amount  is 
significant  because  of  the  high  frequency  of  sensitive  permis¬ 
sion  requests:  a  15%  error  rate  yields  thousands  of  privacy 
violations  per  user — based  on  the  latest  dataset,  this  amounts  to 
a  potential  privacy  violation  every  minute.  It  further  shows  that 
AOFU’s  correctness  assumption — that  users  make  binary  deci¬ 
sions  based  only  on  the  application:permission  combination — 
is  incorrect.  Users  take  a  richer  space  of  information  into 
account  when  making  decisions  about  permission  requests. 

B.  Our  ML-Based Model 

We  show  that  ML  techniques  are  effective  at  learning  from 
both  the  user’s  previous  decisions  and  the  current  environmen¬ 
tal  context  in  order  to  predict  whether  to  grant  permissions  on 
the  user’s  behalf.  In  fact,  our  techniques  achieve  better  results 
than  the  methods  currently  deployed  on  millions  of  phones 
worldwide — while  imposing  significantly  less  user  burden. 

Our  work  incorporates  elements  of  the  surrounding  context 
into  a  machine-learning  model.  This  better  approximates  user 
decisions  by  finding  factors  relevant  for  users  that  are  not 
encapsulated  by  the  AOFU  model.  In  fact,  our  ML  model 
reduces  the  errors  made  by  the  AOFU  model  by  75%.  Our  ML 
model’s  97%  accuracy  is  a  substantial  improvement  over 
AOFU’s  85%  and  AOI’s  25%;  the  latter  two  of  whichcomprise 
the  status  quo  in  the  Android  ecosystem. 

Our  research  shows  that  many  users  make  neither  random 
nor  fixed  decisions:  the  environmental  context  plays  a  signif¬ 
icant  role  in  user  decision-making.  Automatically  detecting  the 
precise  context  surrounding  a  request  for  sensitive  data  is  an 
incredibly  difficult  problem  (e.g.,  inferring  how  data  will  be 
used),  and  is  potentially  intractable.  However,  to  better  support 
user  privacy,  that  problem  does  not  need  to  be  solved;  instead, 
we  show  that  systems  can  be  improved  by  using  environmental 
data  to  infer  when  context  has  changed.  We  found  that  the  most 
predictive  factors  in  the  environmental  context  were  whether 
the  application  requesting  the  permission  is  visible,  and  what 
the  foreground  application  the  user  is  engaged  with.  These  are 
both  strong  contextual  cues  used  by  users,  insofar  as  they 
allowed  us  to  better  predict  changes  in  context.  Our  results 
show  that  ML  techniques  have  great  potential  in  improving  user 
privacy,  by  allowing  us  to  infer  when  context  has  changed,  and 
therefore  when  users  would  want  data  requests  to  be  brought 
to  their  attention. 

C.  Reducing  the  User  Burden 

Our  work  is  also  novel  in  using  passively  observable  data  to 
infer  privacy  decisions:  we  show  that  we  can  predict  a  user’s 
preferences  without  any  permission  prompts.  Our  model 
trained  solely  on  behavioral  traits  yields  a  three-fold 
improvement  over  AOI;  for  Defaulters — who  account  for  53% 
of  our  sample — it  was  as  accurate  as  AOFU-AP.  These  results 
demonstrate  that  we  can  match  the  status  quo  without  any 


active  user  involvement  (i.e.,  the  need  for  obtrusive  prompts). 
These  results  imply  that  learning  privacy  preferences  may  be 
done  entirely  passively,  which,  to  our  knowledge,  has  not  yet 
been  attempted  in  this  domain.  Our  behavioral  feature  set 
provides  a  promising  new  direction  to  guide  research  in 
creating  permission  models  that  minimize  user  burden. 

The  ML  model  trained  with  contextual  data  and  past 
decisions  also  significantly  reduced  the  user  burden  while 
achieving  higher  accuracy  than  AOFU.  The  model  yielded  an 
81%  reduction  in  prediction  errors  while  reducing  user 
involvement  by  25%.  The  significance  of  this  observation  is 
that  by  reducing  the  risk  of  habituation,  it  increases  reliability 
when  user  input  is  needed. 

D.  User-  and  Permission-Tailored  Models 

Our  ML-based  model  incorporates  data  from  all  users  into  a 
single  predictive  model.  It  may  be  the  case,  however,  that  a 
collection  of  models  tailored  to  particular  types  of  users 
outperforms  our  general-purpose  model — ^provided  that  the 
correct  model  is  used  for  the  particular  user  and  permission.  To 
determine  if  this  is  true,  we  clustered  users  into  groups  based 
first  on  their  behavioral  features,  and  then  their  denial  rate,  to 
see  if  we  could  build  superior  cluster- tailored  ML  models. 
Having  data  for  only  131  users,  however,  resulted  in  clusters 
too  small  to  carry  out  an  effective  analysis.  We  note  that  we  also 
created  a  separate  model  for  each  sensitive  permission  type, 
using  data  only  for  that  permission.  Our  experiments 
determined,  however,  that  these  models  were  no  better  (and 
often  worse)  than  our  general  model.  It  is  possible  that  such 
tailored  models  may  be  more  useful  when  our  system  is 
implemented  at  scale. 

E.  Attacking  the  ML  Model 

Attacking  the  ML  model  to  get  access  to  users’  data  without 
prompting  is  a  legitimate  concern  [5].  There  are  multiple  ways 
an  adversary  can  influence  the  proposed  permission  model:  (i) 
imposing  an  adversarial  ML  environment  [31];  (ii)  polluting  the 
training  set  to  bias  the  model  to  accept  permissions;  and  (iii) 
manipulating  input  features  in  order  to  get  access  without  user 
notification.  We  assume  in  this  work  that  the  platform  is  not 
compromised;  a  compromised  platform  will  degrade  any 
permission  model’s  ability  to  protect  resources. 

A  thorough  analysis  on  this  topic  is  outside  of  our  scope. 
Despite  that,  we  looked  at  the  possibility  of  manipulating 
features  to  get  access  to  resources  without  user  consent.  None 
of  the  behavioral  features  used  in  the  model  can  be  influenced, 
since  that  would  require  compromising  the  platform.  An 
adversary  can  control  the  runtime  features  for  a  given 
permission  request  by  specifically  choosing  when  to  request  the 
permission.  We  generated  feature  vectors  manipulating  every 
adversary-controlled  value  and  combination  from  our  dataset, 
and  tested  them  on  our  model.  We  did  not  find  any  conclusive 
evidence  that  the  adversary  can  exploit  the  ML  model  by 
manipulating  the  input  features  to  get  access  to  resources 
without  user  consent. 
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As  this  is  not  a  comprehensive  analysis  on  attack  vectors,  it 
is  possible  that  a  scenario  exists  where  the  adversary  is  able  to 
access  sensitive  resources  without  prompting  the  user  first.  Our 
preliminary  analysis  suggests  that  such  attacks  may  be  non¬ 
trivial,  but  more  work  is  needed  to  study  and  prevent  such 
attacks,  particularly  examining  adversarial  ML  techniques  and 
feature  brittleness. 

F.  Experimental  Caveat 

We  repeat  a  caveat  about  our  experimental  data:  users  were 
free  to  deny  permissions  without  any  consequences.  We 
explicitly  informed  participants  in  our  study  that  their  decisions 
to  deny  permission  requests  would  have  no  impact  on  the  actual 
behavior  of  their  applications.  This  is  important  to  note  because 
if  an  application  is  denied  a  permission,  it  may  exhibit 
undefined  behavior  or  lose  important  functionality.  In  fact, 
researchers  have  noted  that  many  applications  crash  when 
permissions  are  denied  [13].  If  these  consequences  are  imposed 
on  users,  they  may  decide  that  the  functionality  is  more 
important  than  their  privacy  decision. 

If  we  actually  denied  permissions,  users’  decisions  may  skew 
towards  a  decreased  denial  rate.  The  denial  rates  in  our 
experiments  therefore  represent  the  actual  privacy  preferences 
of  users  and  their  expectations  of  reasonable  application 
behavior — not  the  result  of  choosing  between  application  func¬ 
tionality  and  privacy.  We  believe  that  how  people  react  when 
choosing  between  functionality  and  privacy  preferences  is  an 
important  research  question  beyond  the  scope  of  this  paper. 
Such  a  change,  however,  will  not  limit  this  contribution,  since 
our  proposed  model  was  effective  in  guarding  resources  of  the 
users  who  are  selective  in  their  decision  making — the  proposed 
classifier  reduced  the  error  rate  of  Contextuals  by  44%. 

We  believe  that  there  are  important  unanswered  questions 
about  how  to  solve  the  technical  hurdles  surroundingenforcing 
restrictive  preferences  with  minimal  usability  issues.  As  a  first 
step  towards  building  a  platform  that  does  not  force  users  to 
choose  between  their  privacy  preferences  and  required  func¬ 
tionality,  we  must  develop  an  environment  where  permissions 
appear — to  the  application — to  be  allowed,  but  in  reality  only 
spurious  or  artificial  data  is  provided. 

G.  Types  of  Users 

We  presented  a  categorization  of  users  based  on  the  sig¬ 
nificance  that  the  application’s  visibility  played  towards  their 
individual  privacy  decisions.  We  believe  that  in  an  actual 
permission  denial  setting,  the  distribution  will  be  different  from 
what  was  observed  in  our  study.  Our  categorization’s 
significance,  however,  motivates  a  deeper  analysis  on  under¬ 
standing  the  factors  that  divide  Contextuals  and  Defaulters. 
While  visibility  was  an  important  factor  in  this  division,  there 
may  be  others  that  are  significant  and  relevant.  More  work 
needs  to  be  done  to  explore  how  Contextuals  make  decisions 
and  which  behaviors  correlate  with  their  decisions. 

H.  User  Interface  Panel 

Any  model  that  predicts  user  decisions  has  the  risk  of  making 
incorrect  predictions.  Making  predictions  on  a  user’s  behalf. 


however,  is  necessary  because  permissions  are  re-  quested  by 
applications  with  too  high  a  frequency  for  manual  examination. 
While  we  do  not  expect  any  system  to  be  able  to  obtain  perfect 
accuracy,  we  do  expect  that  our  97%  accuracy  can  be  improved 
upon. 

One  plausible  way  of  improving  the  accuracy  of  the  per¬ 
mission  model  is  to  empower  the  user  to  review  and  make 
changes  on  how  the  ML  model  makes  decisions  through  a  user 
feedback  panel.  This  gives  users  recourse  to  correct  undesirable 
decisions.  The  UI  panel  could  also  be  used  to  reduce  the 
usability  issues  and  functionality  loss  stemming  from 
permission  denial.  The  panel  should  help  the  user  figure  out 
which  rule  incurred  the  functionality  loss  and  to  change  it 
accordingly.  A  user  may  also  use  this  to  adjust  their  settings  as 
their  privacy  preferences  evolve  over  time. 

I.  The  Cost  of  Greater  Control 

A  more  restrictive  platform  means  users  will  have  greater 
control  over  the  data  being  shared  with  third  parties.  Applica¬ 
tions  that  generate  revenue  based  on  user  data,  however,  could 
be  cut  off  from  their  primary  revenue  source.  Such  an  effect 
could  disrupt  the  current  eco-system  and  force  app  developers 
to  degrade  app  functionality  based  on  the  availability  of  the 
data.  We  believe  the  current  eco-system  is  unfairly  biased 
against  users  and  tighter  control  will  make  the  user  an  equal 
stakeholder.  While  more  work  is  needed  to  understand  the  ef¬ 
fects  of  a  more  restrictive  platform,  we  believe  it  is  imperative 
to  let  the  user  have  greater  control  over  their  own  data. 

J.  Conclusions 

We  have  shown  a  number  of  important  results.  Users  care 
about  their  privacy:  they  deny  a  significant  number  of  requests 
to  access  sensitive  data.  Existing  permission  models  for  An¬ 
droid  phones  still  result  in  significant  privacy  violations.  Users 
may  allow  permissions  sometimes,  while  denying  them  at 
others,  implying  that  there  are  more  factors  that  go  into  the 
decision-making  process  than  simply  the  application  name  and 
the  permission  type.  We  collected  real-world  data  from  131 
users  and  found  that  application  visibility  and  the  current  fore¬ 
ground  application  were  important  factors  in  user  decisions.  We 
used  the  data  we  collected  to  build  a  machine-learning  model 
to  make  automatic  permission  decisions.  One  of  our  models  had 
a  comparable  error  rate  to  AOFU  and  benefited  from  not 
requiring  any  user  prompting.  Another  of  our  models  required 
some  user  prompts — less  than  is  required  by  AOFU —  and 
achieved  a  reduction  of  AOFU’s  error  rate  by  81%. 
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Appendix  A 

Information  Gain  of  Contextuaf  Features 


Contextual 

Defaulters 

Overall 

AI 

0.4839 

0.6444 

0.5717 

A2 

0.4558 

0.6395 

0.5605 

Permission 

0.0040 

0.0038 

0.0050 

Time 

0.0487 

0.1391 

0.0130 

Visibility 

0.0015 

0.0007 

0.0010 

TABLE  VI 

Feature  Importance  oe  Contextual  Features 


Appendix  B 

Information  Gain  of  Behavioral  Features 


Feature 

Importance 

Amount  of  time  spent  on  audio  calls 

0.327647825 

Frequency  of  audio  calls 

0.321291184 

Proportion  of  times  screen  was  timed  out 
instead  of  pressing  the  lock  button 

0.317631096 

Number  of  times  PIN  was  used  to 
unlock  the  screen. 

0.305287288 

Number  of  screen  unlock  attempts 

0.299564131 

Amount  of  time  spent  unlocking  the  screen 

0.29930659 

Proportion  of  time  spent  on  loud  mode 

0.163166296 

Proportion  of  time  spent  on  silent  mode 

0.138469725 

Number  of  times  a  website  is  loaded  to 
the  Chrome  browser 

0.094996437 

Out  of  all  visited  websites,  the  proportion 
of  HTTPS-secured  websites. 

0.071096898 

Number  of  times  Password  was  used  to 
unlock  the  screen 

0.067999523 

Proportion  of  websites  requested  location 
through  Chrome 

0.028404167 

Time 

0.019799623 

The  number  of  downloads  through  Chrome 

0.014619351 

Permission 

0.001461635 

Visibility 

0.000162166 

TABLE  VII 

Feature  Importance  oe  Behavioral  Features 
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Appendix  C  TurtleGuard:  Helping  Android  Users  Apply  Contextual  Privacy 
Preferences 
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ABSTRACT 

Current  mobile  platforms  provide  privacy  management  in¬ 
terfaces  to  regulate  how  applications  access  sensitive  data. 

Prior  research  has  shown  how  these  interfaces  are  insufficient 
from  a  usability  standpoint:  they  do  not  account  for  context. 
In  allowing  for  more  contextual  decisions,  machine-learning 
techniques  have  shown  great  promise  for  designing  systems 
that  automatically  make  privacy  decisions  on  behalf  of  the 
user.  However,  if  such  decisions  are  made  automatically,  then 
feedback  mechanisms  are  needed  to  empower  users  to  both 
audit  those  decisions  and  correct  any  errors. 

In  this  paper,  we  describe  our  user-centered  approach  to¬ 
wards  designing  a  fully  functional  privacy  feedback  interface 
for  the  Android  platform.  We  performed  two  large-scale  user 
studies  to  research  the  usability  of  our  design.  Our  second, 
580-person  validation  study  showed  that  users  of  our  new 
interface  were  significantly  more  likely  to  both  understand 
and  control  the  selected  set  of  circumstances  under  which 
applications  could  access  sensitive  data  when  compared  to 
the  default  Android  privacy  settings  interface. 

1.  INTRODUCTION 

Smartphones  store  a  great  deal  of  personal  information,  such 
as  the  user’s  contacts,  location,  and  call  history.  Mobile  op¬ 
erating  systems  use  permission  systems  to  control  access  to 
this  data  and  prevent  potentially  malicious  third-party  ap¬ 
plications  (“apps”)  from  obtaining  sensitive  user  data.  Part  of 
the  purpose  of  these  permission  systems  is  to  inform  and 
empower  users  to  make  appropriate  decisions  about  which 
apps  have  access  to  which  pieces  of  personal  information. 

The  popular  open-source  Android  mobile  platform  has  used 
two  general  approaches  to  give  users  control  over  permis¬ 
sions.  Initially,  permissions  were  presented  as  an  install¬ 
time  ultimatum,  or  ask-on-install  (AOI):  at  installation,  an 
application  would  disclose  the  full  list  of  sensitive  resources 
it  wished  to  access.  Users  could  either  grant  access  to  all  re- 
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quested  permissions  or  abort  the  installation  entirely.  Prior 
research  has  shown  that  most  users  do  not  pay  attention  to  or 
do  not  these  prompts  when  shown  at  install-time  [12]. 

Recently,  an  ask-on-first-USe  (AOFU)  permission  system  re¬ 
placed  install-time  disclosures  on  Android.  Under  AOFU, 
the  user  is  prompted  when  an  application  requests  a  sensi¬ 
tive  permission  for  the  first  time.  The  user’ s  response  to  this 
permission  request  carries  forward  to  all  future  requests  by 
that  application  for  that  permission.  The  AOFU  approach, 
however,  fails  to  consider  that  the  user’ s  preferences  may 
change  in  different  contexts.  It  only  learns  the  user’ s  prefer¬ 
ences  once  under  a  certain  set  of  contextual  circumstances: 
the  first  time  an  application  tries  to  access  a  particular  data 
type.  This  system  does  not  account  for  the  fact  that  subse¬ 
quent  requests  may  occur  under  different  contextual  circum¬ 
stances  and  therefore  may  be  deemed  less  appropriate.  For 
instance,  a  user  might  feel  comfortable  with  an  application 
requesting  location  data  to  deliver  desirable  location-based 
functionality.  The  same  user,  however,  might  find  it  unac¬ 
ceptable  for  the  same  application  to  access  location  for  the 
purposes  of  behavioral  advertising,  possibly  when  the  appli¬ 
cation  is  not  even  being  used. 

The  contextual  integrity  framework  can  explain  why  AOFU 
is  insufficient:  it  fails  to  protect  user  privacy  because  it 
does  not  account  for  the  context  surrounding  dataflows  [25] . 
That  is,  privacy  violations  occur  when  a  data  flow  (e.g.,  an 
app’s  access  to  a  sensitive  resource)  defies  user  expectations. 
In  recent  work  [38,  39],  we  showed  that  mobile  users  do  make 
contextual  privacy  decisions:  decisions  to  allow  or  deny  ac¬ 
cess  are  based  on  what  they  were  doing  on  their  mobile  de¬ 
vices  at  the  time  that  data  was  requested. 

In  theory,  asking  the  user  to  make  a  decision  for  every  re¬ 
quest  is  optimal,  as  the  user  will  be  able  to  account  for  the 
surrounding  context  and  can  then  make  decisions  on  a  case- 
by-case  basis.  In  practice,  however,  this  results  in  unusable 
privacy  controls,  as  the  frequency  of  these  requests  could 
overwhelm  the  user  [38].  Consequently,  automating  these 
decisions  with  machine  learning  yields  a  balance  between 
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accurately  implementing  users’  privacy  preferences  and  not 
overburdening  them  with  too  many  decisions  [39].  Such  au¬ 
tomation  requires  the  platform  to  have  feedback  mechanisms 
so  that  automated  decisions  can  be  reviewed  and  errors  can 
be  corrected,  thereby  yielding  fewer  future  errors. 

To  this  end,  we  designed  a  novel  permission  manager,  Turtle- 
Guard,  which  helps  users  to  vary  their  privacy  preferences 
based  on  a  few  selected  contextual  circumstances.  It  also 
provides  information  about  the  apps  that  they  use,  by  pro¬ 
viding  a  feedback  loop  for  them  to  audit  and  modify  how 
automated  decisions  are  made.  TurtleGuard  allows  users  to 

(i)  vary  their  decisions  based  on  the  visibility  of  the  request¬ 
ing  application  -  our  previous  work  showed  that  the  visibil¬ 
ity  of  the  requesting  application  is  a  critical  factor  used  by 
users  when  making  mobile  app  privacy  decisions  [38],  and 

(ii)  have  an  improved  understanding  of  how  third-party  ap¬ 
plications  access  resources  in  the  real  world  and  under  vary¬ 
ing  contextual  circumstances. 

We  conducted  an  initial  400-person  experiment  to  evaluate 
our  preliminary  design.  Based  on  our  analysis  of  this  data, 
we  then  iterated  on  our  design,  conducting  a  580-person  vali¬ 
dation  study  to  demonstrate  our  design’ s  effectiveness.  Both 
experiments  had  four  tasks:  three  tasks  that  involved  using 
the  system  to  locate  information  about  current  application 
permissions,  and  one  task  that  involved  modifying  settings. 
We  observed  that  participants  who  used  TurtleGuard  were 
significantly  more  likely  to  vary  their  privacy  preferences 
based  on  surrounding  circumstances  than  the  control  group. 
We  believe  that  these  results  are  a  critical  contribution  to¬ 
wards  empowering  mobile  users  to  make  privacy  decisions  on 
mobile  phone  platforms.  Our  contributions  are  as  follows: 

•  We  present  the  first  contextually-aware  permission  man¬ 
ager  for  third-party  applications  in  Android. 

•  We  show  that  when  using  our  new  interface  (compared 
to  the  existing  Android  interface)  participants  were 
significantly  more  likely  to  both  understand  when  ap¬ 
plications  had  foreground  versus  background  access  to 
sensitive  data  and  how  to  correctly  control  it. 

•  We  show  that  our  proposed  interface  has  a  minimal 
learning  curve.  Participants,  who  had  never  used  Turtle- 
Guard  before,  were  as  successful  at  accomplishing  in¬ 
formation  retrieval  tasks  as  those  who  used  the  existing 
Android  interface. 

2.  RELATED  WORK 

The  Android  OS  has  thus  far  used  two  different  permission 
models:  ask-on-install  (AOI)  permissions,  and  ask-on-first- 
use  (AOFU)  permissions.  Versions  of  Android  before  ver¬ 
sion  6.0  (Marshmallow)  implemented  ask-on-install  permis¬ 
sions.  Under  this  model,  applications  request  that  the  user 
grant  all  permissions  to  the  application  at  install  time.  The 
user  must  consent  to  all  requested  permissions  in  order  to 
complete  installation.  Otherwise,  if  the  user  wishes  to  deny 
any  permission,  the  only  option  available  is  to  abort  the  in¬ 
stallation  entirely.  Research  has  shown  that  few  users  read 
install  time  permissions,  and  fewer  still  correctly  understand 
their  meaning  [12,  18]. 

Versions  of  Android  from  6.0  (Marshmallow)  onward  use 
the  AOFU  permission  model  instead.  Under  AOFU,  appli¬ 
cations  prompt  users  for  sensitive  permissions  at  runtime. 


These  prompts  protect  access  to  a  set  of  24  “dangerous 
permissions,’’  including  geolocation  data,  contact  lists,  and 
SMS.  Prompts  appear  when  the  application  attempts  to  re¬ 
quest  protected  resources  for  the  first  time.  This  has  the 
advantage  of  giving  users  contextual  clues  about  why  an  ap¬ 
plication  requires  a  protected  resource:  users  can  consider 
what  they  are  doing  when  the  prompt  appears  to  help  de¬ 
termine  whether  to  approve  the  request.  Although  AOFU 
offers  an  improvement  over  the  install-time  model  in  this 
regard,  first-use  prompts  insufficiently  capture  a  user’s  pri¬ 
vacy  preferences  [39].  That  is,  the  AOFU  model  does  not 
consider  scenarios  where  an  application  requests  access  to 
data  under  varying  contexts. 

Research  on  permission  models  has  found  that  users  are  of¬ 
ten  unaware  how  apps  access  protected  resources  and  how 
access  may  be  regulated  [12,  8,  11,36,34].  Shih  et  al.  showed 
that  users  are  more  likely  to  disclose  privacy  information 
when  the  purpose  is  unclear  [35].  Prior  work  has  specifically 
analyzed  location  data:  Benisch  et  al.  show  that  a  vast  num¬ 
ber  of  factors  (time,  day,  location)  contribute  to  disclosure 
preferences  [5];  Reilly  et  al.  show  that  users  want  minimal 
interaction  with  their  technology  [31].  Additionally,  Patil  et 
al.  takes  into  consideration  context:  they  suggest  making 
feedback  actionable  and  allowing  for  selective  control  re¬ 
garding  location  data  [29].  They  also  show  that  users  have 
difficulty  articulating  location  access  controls,  and  suggest  an 
interface  that  includes  contextual  factors  as  a  potential 
solution  [28].  Almuhimedi  et  al.  studied  AppOps,  a  per¬ 
mission  manager  introduced  in  Android  4.3  but  removed  in 
Version  4.4.2  [1].  AppOps  allowed  users  to  review  and  mod¬ 
ify  application  permissions  once  installed,  as  well  as  set  de¬ 
fault  permissions  that  newly  installed  applications  must  fol¬ 
low.  They  examined  privacy  nudges  that  were  designed  to 
increase  user  awareness  of  privacy  risks  and  facilitate  the  use 
of  AppOps.  They  concluded  that  Android  users  benefitfrom 
the  use  of  a  permission  manager,  and  that  privacy  nudges  are 
an  effective  method  of  increasing  user  awareness  [1]. 

Although  AppOps  was  removed  from  Android,  Android  6.0 
(Marshmallow)  reintroduced  permission  management.  It — 
and  subsequent  versions  as  of  this  writing — include  an  up¬ 
dated  interface  that  allows  the  user  to  view  all  of  the  per¬ 
missions  that  a  particular  app  has  been  granted,  as  well  as  all 
of  the  apps  that  have  been  granted  a  particular  permission 
(Figure  1).  Unfortunately,  these  controls  are  buried  deep 
within  the  Settings  app,  and  it  is  therefore  unlikely  that  users 
are  aware  of  them.  For  instance,  viewing  a  particular  app’s 
permissions  requires  navigating  four  levels  of  sub-panels, 
whereas  viewing  all  the  apps  that  haverequested  a  particular 
permission  requires  navigating  five  levels.  By  comparison, 
TurtleGuard  is  one  click  from  the  main  Settings  panel  and 
explicitly  presents  the  relationships  between  ap-  plications, 
permissions,  and  controls. 

XPrivacy  [6],  DonkeyGuard  [7],  Permission  Master  [23],  and 
LineageOS’s*  Privacy  Guard  [24]  are  examples  of  other  third- 
party  permission  management  software.  These  utilities  re¬ 
quire  additional  privileges  and  techniques  to  install  because 
Android  provides  no  official  mechanisms  for  third-party  pro¬ 
grams  to  modify  the  permission  system.  For  instance.  Pri¬ 
vacy  Guard  is  built  into  the  LineageOS  custom  ROM  [24]; 


'UineageOS  is  a  recent  fork  of  CyanogenMod  after  the  lat¬ 
ter’s  discontinuation. 
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Figure  1 :  After  navigating  four  and  five  levels  of  sub-panels 
within  the  Android  Settings  app,  respectively,  users  can  limit 
a  specific  app’s  access  to  specific  permissions  (left)  or  limit 
the  apps  that  can  access  a  particular  permission  (right). 


others  use  the  Xposed  Framework  [32],  which  requires  an 
unlocked  bootloader  and  a  custom  recovery  partition.  Such 
restrictions  are  necessary  to  prevent  malicious  software  from 
interfering  with  the  existing  permission  system. 

Third-party  permission  managers  offer  users  a  variety  of  fea¬ 
tures  to  fine-tune  access  to  sensitive  resources  on  their  de¬ 
vices.  XPrivacy  has  the  option  to  pass  fake  data  to  applica¬ 
tions  that  have  been  denied  access  to  protected  resources  [2]. 
Flomyack  et  al.’s  AppFence  similarly  allows  users  to  deny 
permissions  to  applications  by  providing  fake  data  [16].  Pro¬ 
viding  fake  data  is  more  desirable  than  simply  failing  to  pro¬ 
vide  any  data  at  all,  as  the  latter  may  cause  functionality  loss 
or  application  failures. 

These  managers  follow  an  Identity  Based  Access  Control 
model  (IBAC),  where  individual  permissions  can  be  set  for 
each  app.  Although  this  model  allows  users  to  specify  fine¬ 
grained  permission  preferences,  this  may  be  ineffective  in 
practice  for  two  reasons.  First,  users  may  be  overwhelmed  by 
the  number  of  settings  available  to  them,  some  of  which  are 
only  tangentially  relevant  to  privacy.  This  security  de-  sign 
failure  is  known  as  the  wall  of  checkboxes  [14].  XPrivacy  and 
Permission  Master  show  controls  for  resources  whose  direct 
effects  on  user  privacy  are  unclear,  such  as  keeping  a  device 
awake.  TurtleGuard  improves  usability  by  showing  only 
controls  for  resources  deemed  “dangerous”  in  the  Android 
platform  [15]  and  others  that  previous  research  has  shown  are 
conducive  to  using  run-time  prompts  [10].  Second,  none  of 
the  existing  permission  managers  display  the  context  in 
which  protected  resources  were  accessed.  XPrivacy,  Donkey 
Guard,  and  LineageOS’s  Privacy  Guard  pro-  vide  timestamps 
for  resource  accesses,  but  the  user  does  not  receive  important 
information  about  the  app’s  state,  such  as  whether  it  was 
actively  being  used  when  it  requested  access 


to  sensitive  data.  Permission  Master  offers  no  historical  in¬ 
formation  at  all.  TurtleGuard  partially  addresses  this  prob¬ 
lem  by  listing  recently  allowed  and  denied  permission  access 
requests,  along  with  the  state  and  visibility  of  the  requesting 
application  at  the  time  the  permission  was  requested. 

Apple’s  iOS  platform  offers  visibility-sensitive  location  pri¬ 
vacy  settings;  “Never”  and  “Always”  (the  two  settings  anal¬ 
ogous  to  Android’s  permission  on/o4  toggles),  and  a  “While 
using  the  app”  option,  which  only  permits  an  application  to 
access  location  data  while  the  application  is  active  on  the 
screen.  TurtleGuard  uses  the  same  options,  but  our  design  is 
novel  in  both  the  extent  of  these  settings  and  in  who  controls 
them.  Apple’s  iOS  allows  developers  to  control  which  of  the 
three  options  are  available  to  users  to  select  [3].  Application 
developers  have  faced  criticism  for  removing  the  “While  us¬ 
ing  the  app”  option,  forcing  users  to  choose  between  reduced 
functionality  and  granting  the  application  unrestricted  ac¬ 
cess  to  sensitive  location  data  [26].  Our  design,  by  contrast, 
gives  users  all  three  of  these  options  for  all  sensitive  permis¬ 
sions  (Table  5,  Appendix).  Furthermore,  developers  cannot 
restrict  user  choice  with  these  settings,  as  TurtleGuard  is 
implemented  in  the  operating  system. 

Wijesekera  et  al.  show  that  although  AOFU  improves  on 
install-time  permissions,  AOFU  is  insufficient  because  it  does 
not  account  for  the  context  of  the  requests  [39].  They  exam¬ 
ined  this  by  instrumenting  the  Android  platform  to  log  all 
instances  of  apps  accessing  sensitive  resources.  In  addition  to 
their  instrumentation,  the  platform  randomly  prompted  users 
about  the  appropriateness  of  various  permission  re-  quests  as 
those  requests  occurred.  Participant  responses  to  these 
prompts  were  treated  as  the  dependent  variable  for  a  training 
set.  Their  study  showed  that  95%  of  participants  would  have 
chosen  to  block  at  least  one  access  request  had  the  system 
notified  them.  On  average,  participants  would  have  preferred 
to  block  60%  of  permission  requests.  Indeed,  other  work 
suggests  that  contextual  cues  are  key  in  detecting  privacy 
violations  [25,4]. 

A  natural  extension  of  AOFU  is  “ask  on  every  use”:  rather 
than  extrapolating  the  user’s  first-time  preference  to  all  fu¬ 
ture  accesses  to  a  given  resource,  each  access  instead  requires 
user  input.  Such  a  model  would  conceivably  allow  users  to 
accurately  specify  their  contextual  preferences  because  they 
know  exactly  which  app  attempted  to  gain  access  to  what  re¬ 
source  under  which  circumstance.  This  approach,  however, 
is  unusable  in  practice.  Research  has  shown  that  applica¬ 
tions  request  access  to  permission-protected  resources  with 
great  frequency:  on  an  average  smartphone,  roughly  once 
every  15  seconds  [38].  Such  a  high  frequency  not  only  risks 
habituation,  but  would  render  the  device  inoperable. 

Recent  research  on  permission  models  has  turned  towards 
using  machine  learning  (ML)  [39,  20,  21,  19].  One  advan¬ 
tage  is  ML’s  ability  to  incorporate  nuanced  contextual  data 
to  predict  user  preferences;  the  approach  has  shown  signif¬ 
icantly  lower  error  rates  over  the  status  quo,  i.e.,  AOFU. 
Wijesekera  et  al.  [39]  also  showed  that  ML  reduces  user  in¬ 
volvement,  thereby  minimizing  habituation.  They  empha¬ 
size,  however,  the  importance  of  having  a  user  interface  that 
functions  as  a  feedback-loop  to  the  classifier,  since  no  prac¬ 
tical  classifier  will  ever  be  100%  accurate.  Users  can  use  the 
interface  to  audit  the  decisions  made  by  the  classifier  and 
correct  any  decisions  that  do  not  match  their  preferences. 
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Such  a  mechanism  not  only  ensures  that  the  classifier  im¬ 
proves  its  accuracy  over  time,  it  also  keeps  users  aware  of  de¬ 
cisions  that  were  made  on  their  behalves  and  informs  them 
of  how  third-party  apps  are  accessing  sensitive  resources  un¬ 
der  various  circumstances. 

TurtleGuard  provides  two  core  components  necessary  for  us¬ 
ability  under  such  contextual  privacy  models:  we  provide 
users  with  key  contextual  information  when  regulating  ac¬ 
cess  to  sensitive  resources,  and  we  provide  a  method  for  users 
to  audit  and  correct  the  decisions  that  have  been  automati¬ 
cally  made  by  the  system. 

3.  DESIGN  OBJECTIVES 

TurtleGuard’ s  primary  function  is  to  inform  users  about  the 
decisions  that  have  been  automatically  made  on  their  be¬ 
half,  while  allowing  them  to  easily  correct  errors  (thereby 
improving  the  accuracy  of  future  decisions).  These  errors  can 
be  either  false  positives — an  app  is  denied  a  permission  that 
it  actually  needs  to  function — or  false  negatives — an  app  is 
granted  access  to  data  against  the  user’s  preferences. 

Thompson  et  al.  showed  how  attribution  mechanisms  can  help 
users  better  understand  smartphone  application  resource 
accesses  [37].  They  found  that  users  expect  this  information 
to  be  found  in  the  device’s  Settings  app.  In  our  initial  ex¬ 
periment,  we  evaluated  TurtleGuard  as  a  standalone  app, 
though  for  this  reason  we  ultimately  moved  it  within  the 
Android  Settings  panel  prior  to  our  validation  experiment. 

3.1  Incorporating  Context 

In  prior  work,  researchers  observed  that  only  22%  of  partic¬ 
ipants  understood  that  applications  continue  to  run  when  not 
visible  and  that  they  have  the  same  access  to  sensitive  user 
data  that  they  do  when  actively  being  used  [37].  This  means 
that  the  majority  of  users  incorrectly  believe  that  applications 
either  stop  running  when  in  the  background  or  lose  the  ability 
to  access  sensitive  data  altogether.  Wijesekera  et  al. 
corroborated  this  observation  in  a  field  study  of  users’ 
privacy  expectations:  users  are  more  likely  to  deem 
permission  requests  from  background  applications  as  being 
inappropriate  or  unexpected,  and  indicate  a  desire  to  regu¬ 
late  applications’  access  to  sensitive  data  based  on  whether 
or  not  those  applications  are  in  use  [38]. 

In  the  default  permission  manager,  users  cannot  vary  their 
decisions  based  on  the  visibility  of  the  requesting  applica¬ 
tion,  or  any  other  contextual  factors.  Our  overarching  goal  is 
to  empower  users  to  make  contextual  decisions  (i.e.,  based  on 
what  they  were  doing  on  the  device)  and  to  apply  these 
decisions  to  future  use  cases,  so  that  fewer  decisions  need  to 
be  explicitly  made  overall.  As  a  first  step  towards  allowing 
users  to  make  contextual  decisions,  TurtleGuard  makes  de¬ 
cisions  about  whether  or  not  to  allow  or  deny  access  based 
on  whether  the  requesting  application  is  actively  being  used. 
While  this  is  but  one  contextual  factor  amongst  many,  it  is 
likely  one  of  the  most  important  factors  [38]. 

Moving  one  step  beyond  the  all-or-nothing  approach  of  al¬ 
lowing  or  denying  an  application’ s  access  to  a  particular  data 
type,  our  new  design  gives  the  user  a  third  option:  allowing 
applications  to  access  protected  data  only  when  in  use 
(Table  1  and  Figure  2).  When  the  when  in  use  option  is  se¬ 
lected,  the  platform  only  allows  an  application  to  access  a 
resource  if  the  application  is  running  in  such  a  way  that  it 


option  meaning 

always  The  permission  is  always  granted  to  the  re¬ 

questing  application,  regardless  of  whether 
the  application  is  running  in  the  fore¬ 
ground  or  background. 

when  in  use  The  permission  is  granted  to  the  requesting 
application  only  when  there  are  cues  that 
the  application  is  tunning,  and  denied  when 
the  application  is  running  invisibly  in  the 
background. 

never  The  permission  is  never  granted  to  the  re¬ 

questing  application. 


Table  1:  The  three  possible  permission  settings  under 
TurtleGuard.  The  when  in  use  option  accounts  for  the  visi¬ 
bility  of  the  requesting  app,  which  is  a  strong  contextual  cue. 

is  censpicueus  to  the  user  of  the  device.  We  consider  the 
following  behaviors  conspicuous:  (i)  the  application  is  run¬ 
ning  in  the  foreground  (i.e.,  the  user  is  actively  using  it),  (ii) 
the  application  has  a  notification  on  the  screen,  (iii)  the  ap¬ 
plication  is  in  the  background  but  is  producing  audio  while 
the  device  is  unmuted.  If  these  conditions  do  not  hold,  then 
access  to  the  resource  is  denied. 

3.2  Auditing  Automatic  Decisions 

Although  Android  currently  provides  an  interface  to  list  the 
applications  that  recently  accessed  location  data,  similar  in¬ 
formation  is  unavailable  for  other  protected  resources.  The 
existing  Android  interface  also  does  not  differentiate  be¬ 
tween  actions  that  applications  take  when  in  use  and  when 
not  in  use.  TurtleGuard’ s  main  design  objective  is  therefore 
to  communicate  the  types  of  sensitive  data  that  have  been 
accessed  by  applications  and  under  what  circumstances. 

Our  initial  design  of  TurtleGuard  can  be  seen  in  Figure  2. 
The  first  tab  (activity)  shows  all  of  the  recently  allowed  or 
denied  permission  requests,  including  when  those  requests 
occurred  and  whether  the  application  was  in  use  at  the  time. 
TurtleGuard  presents  this  information  as  a  running 
timeline — a  log  sotted  chronologically.  A  second  tab  lists  all 
of  the  apps  installed  on  the  phone  in  alphabetical  order,  al¬ 
lowing  the  user  to  examine  what  decisions  have  been  made 
for  all  permissions  requested  by  a  particular  app.  The  user 
can  expand  a  log  entry  to  change  future  behavior,  if  the  plat¬ 
form’s  automated  decision  to  allow  or  deny  a  permission  did 
not  align  with  the  user’s  preferences.  When  the  user  uses  this 
interface  to  change  a  setting,  the  classifier  is  retrained  based 
on  the  updated  information. 

3.3  Permission  Families 

Android  uses  over  100  permissions  and  a  given  resource  can 
have  more  than  one  related  permission.  Felt  et  al.  found  that 
not  all  the  permission  types  warrant  a  runtime  prompt — it 
depends  on  the  nature  of  the  resource  and  the  severity  of  the 
threat  [9].  Consequently,  TurtleGuard  only  manages  a  subset 
of  permissions  (Table  5,  Appendix)  based  on  those  deemed 
sensitive  by  prior  work  and  by  the  latest  Android  version.  In 
the  first  prototype  of  TurtleGuard,  we  had  listed  the  original 
names  of  the  permissions,  ungrouped.  One  of  the  changes  we 
made  as  we  iterated  on  our  design  after  our  pilot  experiment 
was  to  implement  permission  “families”.  For 
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Figure  2:  The  pilot  design  of  TurtleGuard  listed  recent  app 
activity  (top  left),  a  list  of  installed  apps  and  their  associated 
permissions  (top  right).  Permissions  can  be  always  granted, 
granted  only  when  in  use,  or  ne\/er  granted  (bottom). 


the  first  experiment  to  examine  our  initial  prototype,  as  de¬ 
scribed  in  the  previous  section.  Based  on  the  analysis  of  our 
first  experiment,  we  made  changes  to  our  design,  and  then 
validated  those  changes  through  a  second  experiment.  In 
both  experiments,  we  asked  participants  to  perform  four 
different  tasks  using  an  interactive  Android  simulation. 
These  tasks  involved  either  retrieving  information  about  an 
appli-  cation’s  prior  access  to  sensitive  resources  or 
preventing  ac-  cess  in  the  future  (i.e.,  modifying  settings). 
Our  study  was  approved  by  our  IRB  (#2013-02-4992). 

In  both  experiments,  we  randomly  assigned  participants  to 
either  the  control  or  experimental  conditions.  We  presented 
control  participants  with  an  interactive  HTML5  simulation 
of  the  default  permission  manager,  which  is  accessible  from 
within  the  Settings  app.  We  presented  experimental  partic¬ 
ipants  with  an  interactive  HTML5  simulation  of  our  novel 
permission  manager,  TurtleGuard.  During  our  pilot  exper¬ 
iment,  TurtleGuard  was  accessible  through  an  icon  on  the 
home  screen  labeled  “Privacy  Manager,”  though  we  added  it 
as  a  sub-panel  to  the  Settings  app  prior  to  the  validation 
experiment  (Figure  6  in  the  Appendix).  The  questions  and 
tasks  for  participants  were  identical  for  the  two  conditions 
and  both  experiments. 


4.1  Tasks 

We  presented  participants  with  four  tasks  to  complete  using 
the  interactive  Android  simulations:  three  tasks  to  retrieve 
information  about  permission  settings,  and  one  task  to  mod¬ 
ify  permission  settings.  Some  of  these  tasks  required  partic¬ 
ipants  to  find  information  about  a  specific  app’s  abilities.  In 
order  to  avoid  biases  from  participants’  prior  experiences  and 
knowledge  of  specific  real-world  apps,  these  questions 
instead  focused  on  a  fictitious  app,  Zv\N.  While  we  random¬ 
ized  the  order  of  the  tasks,  we  ensured  that  Task  3  always 
came  before  Task  4  (i.e.,  we  never  asked  them  to  prevent 
background  location  data  collection  prior  to  asking  them 
whether  background  location  data  was  even  possible).  After 
each  task,  we  asked  participants  to  rate  the  difficulty  of  the 
task  using  a  5-point  Likert  scale  (“very  easy”  to  “very  dif¬ 
ficult”).  Finally,  upon  completing  all  tasks,  we  asked  them 
several  demographic  questions  and  then  compensated  them 
$2.  We  now  describe  the  four  tasks  in  detail. 


example,  read  contacts  _and  write  contacts  are  grouped 
into  a  single  contacts  permission  family.  This  means  that 
within  TurtleGuard,  users  only  see  the  human-  readable 
resource  type  and  not  the  underlying  permissions  the 
family  manages.  Any  changes  that  a  user  makes  about 
granting  a  resource  therefore  affects  all  permissions  in  the 
same  family.  For  example,  there  is  no  longer  a  distinction 
between  coarse  and  fine  location  data;  both  are  either  al¬ 
lowed  or  denied  by  a  location  settings  change  made  using 
the  TurtleGuardinterface. 

4.  METHODOLOGY 

We  conducted  two  online  experiments  to  evaluate  the  effec¬ 
tiveness  of  TurtleGuard  at  providing  users  with  information 
and  control  over  app  permissions,  as  compared  to  Android’s 
default  permission  manager  (as  of  versions  6.0).  We  designed 


Task  1:  What  were  the  two  most  recent  applications 
that  accessed  this  device’s  location? 

In  this  task,  we  asked  participants  to  use  the  Android  sim¬ 
ulation  and  identify  the  two  applications  that  most-recently 
accessed  location  data.  Participants  used  two  open-ended 
fields  to  answer  this  question.  In  the  control  condition,  this 
task  was  correctly  accomplished  by  navigating  to  the  “loca¬ 
tion”  screen  from  within  the  Settings  application  (Figure  3). 
This  screen  presents  information  about  applications  that  re¬ 
cently  requested  location  data. 

In  the  experimental  condition,  this  task  was  accomplished  by 
simply  studying  the  “activity”  screen,  which  was  displayed 
immediately  upon  opening  TurtleGuard  (Figure  2).  Given 
that  this  task  was  already  supported  by  the  default  permis¬ 
sion  manager,  we  wanted  to  verify  that  TurtleGuard  per¬ 
formed  at  least  as  well. 
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Task  2:  Currently,  which  of  the  following  data  types 
can  be  accessed  by  the  ZvW application? 

In  the  control  condition,  this  was  accomplished  by  perform¬ 
ing  the  four  steps  to  access  the  screen  in  Figure  4  (right): 
selecting  the“Apps”panel  within  the  Settings  app  (Figure  3, 
left),  selecting  the  ZvW  application,  and  then  selecting  the 
“Permissions.”  This  screen  depicted  a  list  of  permissions 
available  to  the  application  based  on  what  the  application 
declares  as  its  required  permissions;  the  user  is  able  to  fine- 
tune  this  by  selectively  disabling  certain  permissions  using 
the  sliders  on  this  screen.  We  wanted  participants  to  iden¬ 
tify  the  permissions  that  were  enabled,  rather  than  all  of 
those  that  COUidhe  enabled  in  the  future. 

In  the  experimental  condition,  participants  could  accomplish 
this  task  by  selecting  the  “Apps”  tab  from  within  Turtle- 
Guard  and  then  expanding  the  ZvW  application  to  view  its 
requested  permissions  (Figure  2,  top  right).  In  both  condi¬ 
tions,  the  correct  answer  to  the  question  was  that  “location” 
is  the  only  data  type  that  can  be  accessed  by  the  ZvW  ap¬ 
plication.  Again,  given  that  this  task  was  already  supported 
by  the  default  permission  manager,  we  wanted  to  verify  that 
TurtleGuard  performed  at  least  as  well. 

Task  3:  Is  the  ZvW  application  able  to  access  location 
data  when  it  is  not  being  actively  used? 

We  designed  this  task  to  determine  whether  TurtleGuard 
was  effective  at  communicating  to  participants  in  the  ex¬ 
perimental  condition  the  difference  between  foreground  and 
background  data  access.  Similarly,  we  wanted  to  examine 
whether  participants  in  the  control  condition  understood 
that  once  granted  a  permission,  an  application  may  access 
data  while  not  in  use.  Based  on  the  settings  of  the  simula¬ 
tions,  the  correct  answer  across  both  conditions  was  “yes.” 

Participants  in  the  control  group  must  navigate  to  Settings, 
then  the  “Apps”  panel,  and  view  the  list  of  permissions  cor¬ 
responding  to  the  ZvW  application,  similar  to  Task  2.  Lo¬ 
cation  is  turned  on,  and  so  participants  must  be  able  to 
understand  that  this  means  that  the  permission  is  granted  even 
when  it  is  not  actively  being  used.  Participants  in  the 
experimental  condition  can  use  TurtleGuard’ s  “Apps”  tab  to 
view  the  requested  permissions  for  the  ZvW  application.  This 
shows  that  the  location  permission  is  listed  as  “always” 
(Figure  2,  top  right)  and  that  “when  in  use”  is  an  unselected 
option  (Figure  2,  bottom). 

Task  4:  Using  the  simulation,  prevent  ZvW  from  being 
able  to  access  your  location  when  you  aren’t  actively 
using  ZvW  (i.e.,  it  can  still  access  location  data  when 
it  is  being  used).  Please  describe  the  steps  you  took  to 
accomplish  this  below,  or  explain  whether  you  believe 
this  is  even  possible. 

As  a  follow-up  to  the  third  task,  the  fourth  task  involved 
participants  explaining  the  steps  that  they  went  through  in 
order  to  limit  background  location  access,  or  to  explain  that 
it  is  notpossible. 

Those  in  the  experimental  condition  could  locate  and  change 
this  permission  setting  either  through  the  activity  timeline  or 
by  locating  ZvW  from  the  “Apps”  tab  (Figure  2).  We  marked 
answers  correct  that  specifically  mentioned  changing  the 
setting  to  “when  in  use.” 


Those  in  the  confro/ condition  could  not  prevent  this  access. 
We  marked  responses  correct  if  they  indicated  that  this  task 
was  impossible  to  complete.  Two  coders  independently  re¬ 
viewed  the  responses  to  this  task  (Cohen’/s  =0.903).  The 
objective  of  this  task  was  to  see  TurtleGuard’ s  success  at 
allowing  participants  to  vary  settings  based  on  application 
use  (a  strong  contextual  cue)  and  to  examine  whetherpar- 
ticipants  knew  that  this  was  not  possible  when  using  the 
default  permission  manager. 

4.2  UI  Instrumentation 

We  built  an  interactive  HTML5  simulation  of  the  UI  designs 
described  in  the  previous  section  using  proto.io.  We  instru¬ 
mented  the  simulations  to  log  all  interactions  (e.g.,  panels 
visited,  buttons  clicked,  etc.).  This  data  allowed  us  to  ana¬ 
lyze  how  participants  navigated  the  UI  to  perform  each  task. 

4.3  Qualitative  Data 

In  addition  to  analyzing  the  participants’  responses  to  the 
four  tasks,  their  perceived  difficulty  of  each  of  the  tasks,  and 
their  demographic  information,  we  also  collected  responses 
to  two  open-ended  questions: 

Thinking  about  the  tasks  that  you  performed  in  this  sur¬ 
vey,  have  you  ever  wanted  to  find  similar  information 
about  the  apps  running  on  your  smartphone? 

We  coded  participants’  responses  as  a  binary  value.  Re¬ 
sponses  indicating  sentiments  such  as  “yes”  and  “I  always 
wanted  thaf’were  coded  as  true.  Clear  negative  answers  and 

weak  affirmative  answers  such  as  “sometimes”  and  “maybe” 
were  coded  as  false.  The  purpose  of  this  question  is  to  see 
how  prevalent  seeking  information  is  in  the  real  world. 

Thinking  about  the  simulation  that  you  just  used,  what 
could  be  done  to  make  it  easier  to  find  information 
about  how  apps  access  sensitive  information? 

We  coded  participants’  responses  in  multiple  ways.  First, 
as  binary  values  indicating  contentment  with  the  presented 
design.  Responses  that  affirmed  that  the  user  would  change 
nothing  about  the  presented  design  were  coded  as  true.  Any 
complaints  or  suggestions  were  coded  as  false,  as  well  as  re¬ 
sponses  with  uncertainty,  confusion,  or  ambivalence  (e.g. ,  “I 
don’t  know”).  We  further  coded  responses  that  had  specific 
suggestions,  using  tags  for  the  different  themes. 

Each  response  was  coded  by  two  experienced  coders  working 
independently,  who  then  compared  responses  and  recorded 
their  coding  conflicts.  The  coders  discussed  and  reconciled 
the  differences  using  their  mutually  agreed  upon  stricter  in¬ 
terpretation  given  the  nature  of  the  tasks.  This  produced  the 
final  coding  of  the  data,  which  is  used  in  our  analysis. 

5.  PILOT  EXPERIMENT 

Using  the  methodology  outlined  in  the  previous  section,  we 
recruited  400  participants  from  Amazon’s  Mechanical  Turk 
for  a  pilot  experiment.  We  discarded  8  incomplete  sets  of 
responses,  leaving  us  with  392  participants.  Our  sample  was 
biased  towards  male  respondents  (65%  of  392),  however,  a 
chi-square  test  indicated  no  significant  differences  between 
genders  with  regard  to  successfully  completing  each  task. 
Disclosed  ages  ranged  from  19  to  69,  with  an  average  age  of 
33.  In  the  remainder  of  this  section,  we  describe  our  results 
for  each  task,  and  then  describe  changes  we  made  to 
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Figure  3;  In  Task  1,  participants  in  the  control  condition 
could  identify  the  most  recent  applications  that  requested 
location  data  from  within  the  Settings  application.  This  was 
also  a  valid  method  for  Task  1  in  the  experimental  condition 
for  the  validation  study. 


TurtleGuard’s  interface  as  a  result  of  this  initial  experiment. 
We  note  that  in  our  simulation,  Settings  can  only  be  reached 
by  tapping  on  the  icon  from  the  home  screen.  In  all  of  our 
tasks,  we  also  asked  participants  to  evaluate  perceived 
difficulty  using  a  5-point  Likert  scale. 

5.1  Task  1:  Recent  Location  Access 

In  the  confro/ condition,  84%  of  participants  (167  out  of  198) 
correctly  completed  this  task,  whereas  only  68%  (132  out  of 
194)  completed  it  correctly  in  the  exper/menfa/ condition. 
This  difference  was  statistically  significant  (X^=  14.391,  p 
<  0.0005),  though  with  a  small-to-medium  effect  size  (c/ 
=  0.192).  In  both  cases,  answers  were  marked  correct  if 
they  mentioned  both  the  Browser  and  ZvW  applications 
(Table  2).  Of  the  49  participants  in  the  experimental  group 
who  tried  but  failed,  1 3  never  opened  TurtleGuard,  and  over 
73%  (36  of  49)  entered“Browser”and“Contacts”,  which  were 
the  first  two  applications  listed  in  the  activity  tab  of  the  Per¬ 
mission  Manager.  The  activity  tab  showed  recent  resource 
accesses  in  a  chronological  order — “Browser”  had  been  de¬ 
nied  a  location  request  and  “Contact”  had  successfully  ac¬ 
cessed  call  logs. 

Participants  did  not  seem  to  understand  that  the  activity  log 
presented  entries  related  to  all  sensitive  data  types,  not  just 
location  data.  This  confusion  might  also  stem  from  their  fa¬ 
miliarity  with  the  location  access  panel  in  stock  Android,  in 
which  location  access  requests  are  presented  in  chronologi¬ 
cal  order.  We  hypothesize  that  this  confusion  is  addressable 
by  redesigning  the  activity  log  to  better  distinguish  between 
data  types  and  alio  wed- versus-denied  permission  requests. 
One  possible  way  of  implementing  this  is  to  create  separate 
tabs  for  allowed  and  denied  requests,  as  well  as  to  group 
similar  data  types  together  (rather  than  presenting  all  per¬ 
mission  request  activity  in  chronological  order). 


Condition 

Correct 

Incorrect 

Task  1 
control 
experimental 

167  (84%) 
132  (68%) 

31  (15%) 
62  (32%) 

Task  2 
control 
experimental 

140  (70%) 
116(59%) 

58  (29%) 
78  (40%) 

Task  3 
control 
experimental 

86  (43%) 
153  (78%) 

112(56%) 
41  (21%) 

Task  4 
control 
experimental 

47  (23%) 
144  (75%) 

151  (76%) 
49  (25%) 

Table  2;  Participants  in  each  condition  who  performed  each 
task  correctly  during  the  pilot  experiment. 
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Figure  4:  In  Task  2,  participants  in  the  control  condition 
could  identify  the  permissions  granted  to  the  ZvW  applica¬ 
tion  by  selecting  the  “Apps”  panel  from  within  the  Settings 
application,  and  then  selecting  the  application,  followed  by 
the  “Permissions”  panel. 


5.2  Task  2:  Finding  Granted  Permissions 

In  the  second  task,  we  asked  participants  to  list  all  of  the  data 
types  that  the  ZvW  application  currently  had  access  to.  We 
observed  that  140  participants  in  the  control  condition 
(70.7%  of  198)  and  116  participants  in  the  experimental 
condition  (59.8%  of  194)  performed  this  task  correctly.  After 
correcting  for  multiple  testing,  this  difference  was  not 

statistically  significant  (x^  =  5.151,  P  <  0.023). 

However,  despite  the  lack  of  statistical  significance,  we  were 
surprised  that  not  more  people  in  the  experimental  condi¬ 
tion  answered  correctly.  Upon  investigating  further,  we  no¬ 
ticed  several  confounding  factors  that  might  have  made  this 

task  more  difficult  for  people  in  this  condition.  First,  while 
the  control  condition  displays  the  currently-allowed  permis¬ 
sions  as  grayed-out  text  on  the  “App  Info”  page  (Figure  4), 
the  experimental  condition  lists  all  reqtvesfed  permissions — 
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which  is  a  superset  of  the  allowed  permissions  (top-right  of 
Figure  2).  Second,  we  noticed  that  due  to  an  experimental 
design  error,  the  permissions  requested  by  the  ZvW  app  in 
the  experimental  condition  included  several  that  were  not  in¬ 
cluded  in  the  options  presented  to  participants  (e.g.,  “Write 
Contacts”  and  “Read  Call  Log”).  This  may  have  made  this 
task  confusing  for  these  participants. 

5.3  Task  3:  Finding  Background  Activity 

In  the  third  task,  we  asked  participants  whether  the  ZvW 
application  had  the  ability  to  access  location  data  while  not 
actively  being  used.  We  observed  that  86  participants  in  the 
control  condition  (43%  of  198)  correctly  answered  this 
question,  as  compared  to  153  participants  in  the  experimen¬ 
tal  condition  (78%  of  194).  This  difference  was  statistically 
2 

significant  {x  =  5 1.695,/?  <  0.0005)  with  a  medium  effect 
size  (c/  =  0.363).  Thus,  the  new  dashboard  interface  suc¬ 
cessfully  differentiated  between  foreground  and  background 
permission  usage. 

5.4  Task  4:  Limiting  Background  Activity 

We  observed  that  only  47  participants  in  the  control  con¬ 
dition  (23%  of  198)  correctly  stated  that  this  task  was  im¬ 
possible.  In  the  experimental  condition,  144  (74%  of  193)^ 
clearly  articulated  the  steps  that  they  would  go  through  us¬ 
ing  the  privacy  dashboard  to  change  location  access  from 
“always”  to  “when  in  use.”  This  difference  was  statistically 

significant  (x^  =  101.234,  p  <  0.0005)  with  a  large  effect  size 
(c/=  0.509). 

5.5  Design  Changes 

Based  on  the  results  of  our  first  two  tasks,  in  which  partici¬ 
pants  in  the  control  condition  were  more  likely  to  correctly 
locate  information  about  recent  app  activities  and  the  per¬ 
missions  that  apps  had  requested,  we  made  several  design 
changes  to  the  TurtleGuard  interface.  First,  we  split  the  ac¬ 
tivity  timeline  into  two  separate  tabs:  recently  allowed  per¬ 
mission  requests,  and  recently  denied  permission  requests. 
Second,  rather  than  showing  all  activity  in  chronological 
order,  the  activity  timeline  is  now  categorized  by  resource 
type,  with  the  events  for  each  resource  type  sorted  chrono¬ 
logically.  These  changes  can  be  seen  in  the  top  of  Figure  5. 

In  addition  to  these  changes,  we  also  modified  the  apps  tab 
to  show  grayed-out  allowed  permissions  for  each  app,  similar 
to  the  App  Info  panel  in  the  default  permission  manager.  Due 
to  the  error  we  noted  in  the  experimental  condition  in  Task 
2,  we  made  sure  that  all  app  permissions  were  the  same  in 
both  conditions. 

Finally,  we  moved  TurtleGuard  to  be  within  the  Settings  app, 
so  that  it  appears  as  a  panel  labeled  “Permissions  Man-  ager” 
(Figure  6,  Appendix).  For  consistency,  when  participants  in 
the  experimental  condition  select  the  “Permissions”  sub¬ 
panel  from  within  the  “App  Info”  panel  (Figure  4,  left),  they 
are  now  redirected  to  TurtleGuard’ s  “Apps”  panel,  pre¬ 
opened  to  the  app  in  question  (Figure  5,  bottom  right). 

6.  VALIDATION  EXPERIMENT 

Following  our  pilot  experiment  and  subsequent  design  changes, 
we  performed  a  validation  experiment.  In  the  remainder  of 
this  section,  we  discuss  our  results  (Table  3). 

^One  person  could  not  load  the  if  ramecontaining  the  sim¬ 
ulation  during  this  task. 
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Figure  5:  TurtleGuard  separates  recently  allowed  (top  left) 
and  denied  (top  right)  permissions.  The  “Apps”  tab  lists  the 
allowed  permissions  of  all  apps  (bottom  left).  Expanding  an 
app  allows  the  user  to  make  changes  (bottom  right). 


6.1  Participants 

Because  of  several  known  biases  in  Mechanical  Turk’s  de¬ 
mographics  [27,  33,  22],  we  decided  to  compare  a  sample  of 
298  Mechanical  Turk  participants  to  a  sample  of  300  Pro¬ 
lific  Academic  participants.  Peer  et  al.  recently  performed 
several  studies  on  various  crowdsourcing  platforms  and  con¬ 
cluded  that  the  latter  yields  more  diverse  participants  [30]. 
We  limited  both  groups  to  participants  based  in  the  U.S.,  over 
18,  owning  an  Android  phone,  and  having  a  95%  approval 
rating  on  their  respective  platform.  After  removing  18 
incomplete  responses,  we  were  left  with  a  combined  sample 
of  580  participants.  We  analyzed  the  results  from  the  two 
groups,  and  discovered  that  the  high-level  findings  (i.e.. 
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Condition 

Correct 

Incorrect 

Task  1 
control 
experimental 

237  (82.6%) 
241  (82.5%) 

50  (17.4%) 
52  (17.5%) 

Task  2 
control 
experimental 

232  (77.1%) 
226  (80.8%) 

55  (22.9%) 

67  (19.2%) 

Task  3 
control 
experimental 

108  (37.6%) 
230  (78.5%) 

179  (62.4%) 
63  (21.5%) 

Task  4 
control 
experimental 

79  (27.5%) 
224  (76.5%) 

208  (72.5%) 
69  (23.5%) 

Table  3;  Participants  in  each  condition  who  performed  each 
task  correctly  during  the  validation  experiment. 


task  performance)  did  not  observably  differ.  For  the  remain¬ 
der  of  our  study,  we  therefore  discuss  the  combined  results. 
Our  sample  was  biased  towards  male  respondents  (63%  of 
580),  however,  a  chi-square  test  indicated  no  significant  dif¬ 
ferences  between  genders  with  regard  to  successfully  com¬ 
pleting  each  task.  Disclosed  ages  ranged  from  19  to  74,  with 
an  average  age  of  33.  Participants  performed  the  same  tasks 
as  those  in  the  pilot  experiment  and  took  on  average  9  min¬ 
utes  and  17  seconds  to  complete  the  experiment. 

6.2  Task  1:  Recent  Location  Access 

Recall  that  in  this  task,  we  asked  participants  to  identify  the 
two  most  recent  applications  that  accessed  location  data.  For 
the  experimental  condition,  in  addition  to  using  the  same 
method  as  the  control  (navigating  to  the  “Location”  sub-panel 
of  the  Settings  app),  participants  could  navigate  to  the 
“Allowed”  tah  within  TurtleGuard,  and  then  examine 
the“Location”permission  to  see  the  two  most  recent  accesses 
(top  left  of  Figure  5).  In  the  control  condition,  237  partici¬ 
pants  (82.6%  of  287)  correctly  completed  this  task,  whereas 
241  (82.5%  of  293)  completed  it  correctly  in  the  experimental 
condition.  A  chi-square  test  revealed  that  this  difference 

2 

was  not  statistically  significant  (x  =  0.01  l,p  <  0.918). 

We  observed  that  most  of  the  participants  in  both  conditions 
used  the  default  method  of  accomplishing  this  task  (i.e., 
accessing  the  Location  sub-panel):  80.1%  of  those  who 
answered  correctly  in  the  experimental  condition  and  92.8% 
of  those  in  the  control  condition.  Fifteen  participants  in  the 
control  condition  answered  correctly  despite  not  accessing 
the  panel — likely  by  random  guessing,  and  two  who  an¬ 
swered  correctly  hy  exhaustively  searching  the  “App  Info” 
panels  of  installed  apps,  to  see  which  had  heen  granted  the 
location  permission;  48  participants  in  the  experimental  con¬ 
dition  used  TurtleGuard  to  yield  the  correct  answer. 

A  total  of  1 02  participants  incorrectly  answered  the  question 
in  Task  1 .  Of  the  incorrect  responses,  five  participants  failed 
to  properly  navigate  the  simulation  and  wrote  that  it  was 
broken  or  the  buttons  did  not  work;  9  participants  did  not 
respond  or  wrote  that  they  did  not  know.  Of  the  other  88 
participants,  38  (43%)  listed  “App  Store”  as  one  of  their 
selections,  making  it  the  most  common  error. 


More  specifically,  18  participants  listed  their  answers  as  hoth 
“App  Store”  and  “Browser.”  We  believe  that  this  is  because 
both  the  stock  Android  Apps  Manager  and  TurtleGuard’ s 
“Apps”  tab  (Figure  5,  bottom)  sort  the  entries  alphabetically, 
and  by  looking  at  the  permissions  available  to  both  of  these 
apps,  participants  would  see  that  hoth  have  location  access. 
Nevertheless,  they  are  not  the  most  recent  apps  to  access 
location  data. 

Overall,  these  results  suggest  that  the  changes  we  made  af¬ 
ter  our  pilot  experiment  resulted  in  marked  improvements. 
We  further  investigated  this  by  examining  participants  ’  per¬ 
ceived  ease-of-use,  as  measured  using  the  5-point  Likert  scale 
(“very  easy  (1)”  to  “very  difficult  (5)”).  In  the  experimen¬ 
tal  condition,  84  participants  accessed  TurtleGuard  to  com¬ 
plete  this  task  (regardless  of  whether  or  not  they  answered 
correctly).  We  compared  these  84  responses  with  the  463 
responses  from  participants  who  only  used  the  default  Set¬ 
tings  panel  (i.e.,  195  in  the  experimental  group  and  268 
in  the  control  group).  The  median  responses  from  both 
groups  was  “easy”  (2),  however  there  was  a  statistically  sig¬ 
nificant  difference  between  the  groups  (Wilcoxon  Rank-Sum 
test:  Z  =  3t9605,  p  <  0.0005),  with  a  small  effect  size  (r 
=  0.17) — participants  who  used  TurtleGuard  found  it  more 
difficult  compared  to  the  default  Settings  panel.  This 
difference  appears  to  be  due  to  those  who  performed  the  task 
incorrectly :  the  median  response  for  T urtleGuard  users  who 
answered  incorrectly  was  “difficult  (4),”  whereas  it  was  “neu¬ 
tral  (3)”  for  other  participants.  This  may  actually  be  a  good 
thing:  participants  who  confidently  answered  incorrectly  are 
at  greater  risk  due  to  overconfidence,  whereas  those  who  had 
difficulty  may  be  more  likely  to  seek  out  more  information. 

6.3  Task  2:  Finding  Granted  Permissions 

In  this  task,  participants  had  to  locate  the  app’s  allowed 
permissions  to  discover  that  “location”  was  the  only  allowed 
permission  in  both  the  experimental  and  control  conditions. 
This  could  be  accomplished  by  viewing  TurtleGuard’ s  Apps 
tab  (bottom  of  Figure  5)  for  those  in  the  experimental  con¬ 
dition,  or  hy  viewing  an  app’s  App  Info  panel  from  within 
the  Settings  app  (Figure  4),  which  was  available  to  those  in 
either  condition. 

In  total,  458  participants  correctly  performed  this  task  (79% 
of  580).  Table  3  displays  the  breakdown  of  the  results  by 
condition.  A  chi-square  test  did  not  yield  statistically  sig¬ 
nificant  results  between  the  two  conditions  in  terms  of  task 
completion  (X^  =  0.984,  p  <  0.321). 

Of  the  226  experimental  condition  participants  who  per¬ 
formed  the  task  correctly,  127  (56.2%)  did  so  by  using  Turtle- 
Guard.  In  total,  145  experimental  condition  participants  ac¬ 
cessed  TurtleGuard,  and  reported  a  median  task  difficulty  of 
“easy  (2).”  This  did  not  significantly  differ  from  the  375  other 
participants  in  both  conditions  who  only  examined  the 
default  Settings  panels  to  perform  the  task  and  also  reported 
a  median  difficulty  of  “easy”  (Z  =  1 .808,  p  <  0.238);  60  par¬ 
ticipants  never  opened  Settings  (10  of  whom  answered  the 
question  correctly,  likely  due  to  random  guessing). 

6.4  Task  3:  Finding  Background  Activity 

To  perform  this  task,  participants  in  the  control  group  had  to 
navigate  to  Settings,  then  the  “Apps”  panel,  and  view  the  list 
of  permissions  corresponding  to  the  ZvW  application 
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(Figure  4).  However,  performing  this  sequence  of  steps  still 
did  not  guarantee  they  would  answer  the  question  correctly; 
they  needed  to  observe  that  location  data  was  allowed,  as 
well  as  understand  that  this  meant  that  location  data  could  be 
accessed  by  the  app  even  when  it  is  not  actively  being  used. 
Participants  in  the  experimental  condition  answered  this 
question  through  TurtleGuard,  which  shows  that  the  lo¬ 
cation  permission  was  listed  as  “Always”  (Figure  5),  thereby 
eliminating  the  ambiguity. 

We  observed  that  230  experimental  condition  participants 
answered  this  question  correctly  (78.5%  of  293),  as  com¬ 
pared  to  only  108  confro/ participants  (37.6%  of  287).  A 
chi-square  test  showed  that  this  difference  was  significant 
(X=  97.914,  p  <0.0005)  with  a  medium-to-large  effect  size 
(c/  =  0.414).  This  observation  corroborates  Thompson  et 
al.’s  findings  [37]  that  users  are  largely  unaware  thatapps 
can  access  sensitive  data  when  not  in  use.  TurtleGuard, 
however,  was  more  effective  at  communicating  this  informa¬ 
tion  to  participants.  Among  the  participants  in  the  experi¬ 
mental  condition,  24.57%  took  the  extra  step  to  click  on  the 
location  entry  (bottom  right  of  Figure  5)  to  see  the  other  op¬ 
tions  available  (Figure  2):  always,  when  in  use,  and  never. 

We  found  that  129  participants  used  TurtleGuard  to  per-  form 
this  task,  which  suggests  that  101  (34.5%  of  experimental 
condition  participants)  still  got  it  correct  either  based  on  prior 
knowledge — a  proportion  consistent  with  Thompson  et  al.’s 
findings  [37] — or  after  having  used  TurtleGuard  in  pre¬ 
ceding  tasks.  There  were  383  participants  who  completed  the 
task  by  examining  existing  areas  of  the  Settings  app,  whereas 
68  participants  never  bothered  to  open  Settings  to  complete 
this  task.  The  median  ease  of  use  for  those  who  used 
TurtleGuard  was  “easy  (2)”,  while  the  median  ease  of  use  for 
those  who  used  the  default  permission  manager  was  “neutral 
(3)”.  This  difference  was  statistically  significant  (Z  =  2.885, 
p  <  0.004)  with  a  small  effect  size  (r  =  0.13).  Participants  in 
the  control  condition  also  took  significantly  longer  to 
complete  the  task:  49.63  seconds  versus  26.65  seconds.  A 
Wilcoxon  Rank-Sum  test  found  this  difference  to  be 
statistically  significant  (Z  =  —5.239,  p  <0.0005,  r=  0.22). 

6.5  Task  4:  Limiting  Background  Activity 

Task  4  asked  participants  to  describe  the  steps  to  prevent  an 
application  from  accessing  location  data  while  the  applica¬ 
tion  was  not  in  use,  or  to  state  that  it  is  not  possible  to  pre¬ 
vent  it.  It  is  only  possible  to  prevent  it  using  TurtleGuard. 

In  the  experimental  condition,  224  (76.5%  of  293)  explic-  itly 
stated  how  they  would  use  TurtleGuard  to  change  the 
permission  to  “when  in  use”,  whereas  only  79  (27.5%  of 
287)  control  group  participants  correctly  stated  that  this  task 
was  impossible  using  the  default  permission  manager.  This 
difference  was  statistically  significant  (x^  =  137.14,  p  < 
0.0005)  with  a  large  effect  size  (c/=  0.49). 

A  majority  of  the  participants  (72.5%)  in  the  control  group 
incorrectly  believed  that  they  could  vary  their  decisions  based 
on  the  visibility  of  the  application.  The  most  common  re¬ 
sponses  involved  disabling  location  data  altogether,  prevent¬ 
ing  the  app  from  running,  or  restricting  “background  data”: 


^We  used  a  very  conservative  rubric:  11  participants  who 
described  using  TurtleGuard,  but  did  not  explicitly  use  the 
phrase  “when  in  use,”  were  coded  as  being  incorrect. 


*  Settings  >  Apps  >  ZvW  >  Toggle  Location  0-<-' 

*  Disable  or  Force  Stop  the  Application 

*  Settings  >  Location  >  ZvW  >  Permissions  >  Toggle 
Location  O-sJ 

*  Settings  >  Apps  >  ZvW  >  Data  Usage  >  Restrict 
Background  Data 

*  Settings  >  Location  >  Toggle  Location  O 

A  considerable  portion  (14%)  chose  to  “restrict  background 
data,”  which  does  something  else  entirely:  it  prevents  data 
surcharges  while  roaming  on  foreign  networks.  This  is  an¬ 
other  example  of  a  disconnect  between  users’  mental  models 
and  the  true  meaning  of  these  configuration  options.  That 
said,  a  small  number  of  participants  in  the  control  condition 
correctly  stated  that  they  would  need  to  disable  the  app’s 
location  permission,  and  then  re-enable  it  every  time  they 
wanted  to  use  that  app,  a  tedious  process  that  is  prone  to 
forgetfulness — we  treated  this  response  as  correct.  Another 
substantial  portion  among  the  default  permission  manager 
condition  (46%)  wanted  to  block  the  location  globally  (from 
the  default  location  panel)  or  block  the  location  access  from 
ZvW  app  entirely.  While  this  is  an  overly  restrictive  option 
compared  to  when  in  use,  this  is  the  closest  option  provided 
in  Android — we  treated  this  as  an  incorrect  response. 

As  expected,  participants  in  the  control  condition  rated  the 
difficulty  of  this  task  as  “neutral  (3)”,  whereas  the  median 
Likert  score  from  those  in  the  experimental  condition  was 
“easy  (2)”.  This  difference  was  statistically  significant  with  a 
large  effect  size  (p  <  0.0005,  c/  =  0.49).  The  participants  in 
the  control  condition  who  successfully  completed  the  task 
(e.g.,  by  acknowledging  it  was  impossible)  struggled 
immensely  with  it,  evaluating  it  as  “difficult  (4)”. 

7.  USER  PERCEPTIONS 

After  completing  the  four  tasks,  participants  answered  two 
open-ended  questions  about  whether  they  have  looked  for 
this  type  of  permission  information  in  the  past,  and  whether 

they  have  any  suggestions  to  offer  us  about  the  design  of  the 
interface  they  had  just  used.  Two  researchers  independently 
coded  each  question  and  then  resolved  conflicts.  Weprovide 
Cohen’s  inter-rater  reliability  statistic  (  ^)  for  each  coding. 

7.1  Prior  Experiences 

Our  first  question  asked:  Thinking  about  the  tasks  that  you 
performed  in  this  survey,  have  you  ever  wanted  to  find  simi¬ 
lar  information  about  the  apps  running  on  your  smartphone  ? 

Our  goal  was  to  determine  whether  participants  had  pre¬ 
viously  thought  about  resource  access  or  configuring  pri¬ 
vacy  preferences,  and  whether  having  these  features  would 
be  beneficial.  On  average,  63.1%  of  participants  stated  that 
they  had  thought  about  this  (Cohen’s  k  =  0.192),  and  the 
experimental  condition  they  were  in  proved  to  be  insignif¬ 
icant.  We  did,  however,  observe  a  positive  correlation  be¬ 
tween  performance  on  the  four  tasks  (i.e.,  number  of  tasks 
performed  correctly)  and  reporting  having  previously  thought 
about  these  issues  (p<0. 007511,  /'=0.155). 

Among  the  people  who  chose  to  be  more  detailed  in  their 
responses,  several  themes  emerged.  A  large  portion  men¬ 
tioned  that  the  reason  they  had  tried  these  tasks  before  is  that 
they  wanted  to  be  able  to  exert  more  control  over  their 
installed  apps: 
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Changes  No  Changes 

control  245  (85.4%)  42  (14.6%) 

experimental  187  (63.8%)  106  (36.3%) 


T  able  4:  Whether  participants  believed  changes  were  needed 
to  the  interfaces  they  used  during  the  validation  study. 

•  “/  ivas  somewhat  familiarwith  these  menus  already  be¬ 
fore  starting  this  task.  I  like  to  have  oontrol  over  my 
app  permissions  inoluding  location  and  data  manage¬ 
ment.” 

•  “  Yes,  I’ve  often  wanted  a  little  more  control  over  what 
my  apps  get  to  access” 

A  minority  of  participants  expressed  their  frustrations  on 
how  the  current  default  user  interfaces  in  Android  were  con¬ 
fusing  and  did  not  let  them  set  privacy  preferences  the  way 
they  wanted: 

•  “  Yes  but  usually  can’t  find  anything  on  there  either  like 
these.  So  I  gave  up  trying.” 

•  “Yes.  I  want  to  know  what  they  collect,  although  it  gets 
tedious  to  try  to  figure  it  all  out.  Sometimes  I’d  rather 
just  ignore  it.” 

These  comments  highlight  the  fact  that  many  users  want  to 
have  control  over  resource  usage  hy  applications,  and  that 

many  feel  helpless  to  do  so,  given  the  options  offered  by 
current  privacy  management  interfaces.  These  observations 
further  strengthen  the  need  for  a  more  usahle  interface  that 
will  help  people  to  feel  more  empowered. 

7.2  Suggestions 

In  our  second  exit  survey  question,  we  asked:  Thinking  about  the 
simulation  that  you  just  used,  what  could  be  done  to  make  it 
easier  to  find  information  about  how  apps  access  sensitive 
information? 

This  question  had  two  purposes:  (i)  to  gather  specific  design 
recommendations  from  participants  who  used  TurtleGuard; 
(ii)  to  get  general  suggestions  from  participants  who  used  the 
default  permission  manager. 

In  total,  66.03%  participants  (383  of  580)  suggested  at  least 
some  change  or  improvement  (Cohen’s  k  =  0.896).  Table  4 
shows  the  breakdown  of  how  many  participants  in  each  con¬ 
dition  prefer  a  change  in  the  dashboard  within  their  condi¬ 
tion.  A  chi-square  test  shows  a  statistically  significant  as¬ 
sociation  between  a  participant’s  condition  and  whether  the 
participant  wants  changes  in  the  dashboard  (p  <  0.00005,  c/  = 
0.237).  This  suggests  the  participants  in  the  experimental 
condition  are  more  satisfied  with  the  controls  provided  by  the 
new  design  than  those  in  the  control  condition.  Our  work 
aims  to  fill  the  need  users  have  regarding  control  over 
permissions  and  their  personal  privacy. 

The  most  common  suggestion  (32.24%  of  all  suggestions) 
was  to  reduce  the  number  of  layers  to  the  actual  permis¬ 
sion  interface  (Cohen’^-  =  0.603).  Participants  complained 

about  number  of  different  interfaces  they  had  to  traverse 
before  reaching  the  actual  permission  interface.  Many  par¬ 
ticipants  suggested  that  they  would  prefer  to  reach  a  per¬ 


mission  control  interface  directly  through  the  application — 
either  as  part  of  the  application  or  by  pressing  the  app  icon. 
TurtleGuard  addresses  this  concern  by  providing  a  path  to 
permission  management  that  involves  fewer  clicks  and  cen¬ 
tralizes  all  permission  management  functionality. 

•  “Streamline  the  interface  to  require  less  touches  to  find 
the  information  about  permissions  and  make  it  explicit 
as  to  what  type  of  data  would  be  collected  if  allowed.” 

•  “Perhaps  have  an  easier  way  to  access  the  app’s  set¬ 
tings,  such  as  holding  onto  an  app’s  icon  will  bring  up 
its  specific  settings.” 

•  “Make  each  app  itself  have  the  option  to  find  that  infor¬ 
mation  instead  of  going  to  the  general  phone  settings.” 

•  “There  should  be  one  centralized  location,  or  maybe  an 
app  for  that.  Just  to  toggle  with  these  very  important 
settings.” 

Seven  participants  thought  having  a  log  of  recent  resource 
usage  by  applications  would  be  useful.  Some  went  further, 
mentioning  that  the  log  should  also  provide  contextual  cues, 
such  as  the  visibility  of  the  application  at  the  time  it  makes 
the  request.  This  finding  provides  evidence  in  support  of  Liu 
et  al.  [20],  that  recent  statistics  help  users  make  better  de¬ 
cisions.  TurtleGuard  provides  this  functionality  by  showing 
all  the  recent  resource  requests  along  with  (i)  the  decision 
that  platform  took  on  behalf  of  the  users,  (ii)  the  time  that  the 
decision  was  made,  and  (iii)  the  visibility  of  the  requesting 
application. 

•  “It  would  be  useful  to  have  a  dashboard  which  shows 
which  apps  are  accessing  what  and  when.  Being  able 
to  see  a  log  of  the  actual  data  that  was  accessed  would 
also  be  useful.” 

•  “A  log  could  be  provided  as  an  option  in  the  settings 
that  shows  all  times  an  app  accessed  sensitive  infor¬ 
mation.” 

A  few  participants  (14.6%)  also  suggested  that  there  should 
be  a  tutorial,  wizard  style  guide,  or  a  FAQ  to  explain  how 
to  manage  permissions  (Cohen’s  =  0.651).  Some  wanted 
the  applications  to  explain  why  they  need  access  to  certain 
resources.  Some  even  suggested  runtime  prompts  for  every 
sensitive  request  access.  One  participant  suggested  that  app 
developers  hold  a  YouTube  Q&A  session  on  resource  usage 
after  each  release: 

•  “/As  the  app  is  being  introduced  to  the  users,  they  should 
make  a  youtube  q&a  to  answer  any  simple  questions 
like  this.” 

Prior  work  has  already  shown  that  having  runtime  prompts 
on  every  sensitive  request  is  not  feasible  [38] — we  believe 
that  a  log  of  recent  resource  accesses  with  surrounding  con¬ 
text  is  the  closest  practical  solution. 

8.  DISCUSSION 

Our  primary  goal  is  to  empower  users  to  make  privacy  de¬ 
cisions  better  aligned  with  their  preferences  and  to  keep  them 
informed  about  how  third-party  applications  exercise  granted 
permissions,  and  under  what  circumstances.  We 
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performed  iterative  user-centered  design  on  a  new  permis¬ 
sion  management  interface,  TurtleGuard,  which  offers  users 
significant  improvements  in  their  ability  to  control  permis¬ 
sions  when  compared  to  the  default  permission  manager. 

8.1  Auditing  Automated  Decision  Making 

Recent  research  uses  machine-learning  techniques  to  auto¬ 
matically  predict  users’  permission  preferences  [39,  20,  19, 
21].  While  machine-learning  (ML)  techniques  have  been 
shown  to  be  better  at  predicting  users’  preferences  [39],  they 
are  still  prone  to  errors. 

If  systems  are  going  to  use  ML  in  the  future,  there  must  be 
mechanisms  for  users  to  audit  the  decisions  made  on  their 
behalves.  We  believe  that  the  design  we  present  in  our  study 
is  a  critical  first  step  towards  achieving  that  goal.  Partici¬ 
pants  using  TurtleGuard  were  better  able  to  understand  and 
control  when  apps  have  access  to  sensitive  data  than  partic¬ 
ipants  using  the  default  permission  manager.  A  substantial 
proportion  of  participants  mentioned  the  desire  to  have  a  log 
that  they  could  use  to  see  how  each  application  accesses 
sensitive  resources — functionality  that  is  missing  in  the  de¬ 
fault  permission  manager,  but  is  provided  by  TurtleGuard. 

8.2  Correcting  Mental  Models 

In  Task  4,  we  asked  participants  to  disable  access  to  loca¬ 
tion  data  when  the  example  app,  ZvW,  was  not  actively  be¬ 
ing  used,  or  to  explain  that  this  was  not  possible.  We  found 
that  72.5%  of  the  participants  in  the  control  condition  in¬ 
correctly  believed  that  this  was  possible.  Analyzing  the  dif¬ 
ferent  paths  that  participants  in  the  control  condition  took 
while  using  the  Android  simulation,  it  was  evident  that  the 
majority  of  participants  did  not  understand  the  limits  of  the 
permission  interface’s  provided  functionality.  This  mis¬ 
match  between  users’  mental  models  and  actual  functional¬ 
ity  may  lead  to  users  incorrectly  believing  that  they  have 
denied  access  to  certain  requests  for  sensitive  data. 

8.3  Privacy  Nudges 

Previous  work  investigated  ways  to  nudge  users  to  config¬ 
ure  their  privacy  settings  and  make  them  aware  of  how  ap¬ 
plications  access  their  data  [20,  13,  17].  While  helping  mo¬ 
tivate  users  to  use  TurtleGuard  (and  other  privacy  man¬ 
agement  interfaces)  is  important,  it  is  out  of  scope  for  this 
work.  Nevertheless,  our  survey  results  showed  that  63.1%  of 
participants — independent  of  condition — previously  searched 
for  permission  information  on  their  smartphones.  This  shows 
that  users  are  keen  to  understand  how  applications  use  their 
sensitive  resources,  andinterfacessimilarto  the  one  we  present 
in  this  study  fill  a  critical  need. 

8.4  Limitations 

In  our  proposed  interface,  TurtleGuard,  we  allow  users  to 
vary  their  decisions  based  on  the  visibility  of  the  requesting 
application.  We  believe  this  is  a  significant  first  step  to¬ 
wards  enabling  users  to  make  contextual  privacy  decisions. 
The  full  extent  of  the  impact  of  the  surrounding  context, 
however,  goes  beyond  the  mere  visibility  of  the  requesting 
application.  More  work  is  needed  to  understand  different 
contextual  factors  and  their  respective  impact  on  users’  pri¬ 
vacy  decisions.  We  hope  the  results  of  this  study  will  pave  the 
path  for  future  work  on  implementing  fully  contextually 
aware  permission  managers. 


Additionally,  we  acknowledge  the  limitations  in  our  screen¬ 
ing  process:  participants  who  selected  Android  as  their  mo¬ 
bile  device  may  have  varying  levels  of  usage  and  knowledge 
regarding  the  platform.  Prior  experience  may  have  rendered 
the  default  permission  manager  as  being  easier  to  use  for 
some  participants  in  the  control  condition.  This  suggests  that 
for  new  Android  users,  the  usability  improvements  of 
TurtleGuard  may  be  even  greater  than  what  we  observed. 

We  also  acknowledge  that  irregularities  in  the  simulation 
may  have  had  an  impact  towards  participants’  comprehen¬ 
sion  and  completion  rates.  These  confounding  factors  intro¬ 
duced  by  the  UI,  however,  would  have  impacted  both  con¬ 
ditions  equally,  because  the  control  condition  was  simulated 
using  the  same  infrastructure  and  development  environment. 
Finally,  for  users  in  the  control  condition.  Task  4  may  have 
been  deceptively  tricky  due  to  its  impossibility.  Neverthe¬ 
less,  the  incorrect  answers  underscore  a  very  real  problem: 
Android  users  are  not  aware  that  they  are  unable  to  deny 
resources  to  applications  that  they  are  not  using. 

8.5  Conclusion 

Android’s  existing  permission  models,  ask-on-install  (AOI) 
and  ask-on-first-use  (AOFU),  are  insufficient  at  fulfilling  users’ 
privacy  desires  and  needs.  Neither  of  the  existing  models 
account  for  contextual  factors  in  their  decisions  to  allow  or 
deny  access  to  sensitive  data.  Users  want  to  protect  their 
sensitive  information,  but  have  a  hard  time  understanding 
when  access  to  data  is  and  is  not  being  allowed.  TurtleGuard 
adds  both  ease  of  use  and  functionality,  including  the  ability  to 
consider  application  visibility  when  specifying  privacy 
preferences,  which  has  been  shown  to  be  a  strong  contextual 
cue.  In  our  study  of  TurtleGuard,  we  had  participants 
perform  permission-related  tasks  and  compared  their  perfor¬ 
mance  TurtleGuard  with  a  control  group  using  the  default 
permission  manager.  Based  on  our  results,  we  iterated  on 
TurtleGuard’ s  design,  and  then  performed  a  validation  ex¬ 
periment  to  confirm  the  validity  of  our  changes.  Our  results 
show  that  users  are  significantly  better  at  performing  per¬ 
mission  management  tasks  with  TurtleGuard  than  the  de-  fault 
permission  manager. 
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APPENDIX 


Permission 

Expianation 

call  phone 

process_outgoing_calls 
read  phone 
read_call  log 
add_voicemail 
write_  calLlog 

Make  and  process  calls  as  well 
as  read  information  about  call 
status,  network  information  and 
previously  made  phone  calls 

camera 

Access  camera  devices 

get  _accounts 

Access  to  list  of  accounts 

read_calendar 

write_calendar 

Read  and  write  events  to  the 
user’s  calendar 

reacLcon  tacts 
write_con  tacts 

Read  and  write  to  user’ s  con¬ 
tacts 

reacLextemaLstorage 

write_externaLstorage 

Read  and  write  files  to  the  user’ s 
external  storage 

record  -audio 

Record  audio 

access_coarse_location 
access_fine  location 
access,  wifi  state 

Read  location  information  in 
various  ways  including  network 
SSID-based  location 

read  sms 
send,sms 
receive  sms 

Read  SMS  messages  from  the  de¬ 
vice  (including  drafts)  as  well  as 
send  and  receive  new  ones  SMS 

Table  5:  Sensitive  permissions  managed  by  TurtleGuard. 
Permissions  grouped  by  a  single  explanation  form  the  fam¬ 
ilies  used  in  our  system  to  reduce  the  number  of  managed 
permission  as  discussed  in  Section  3. 
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Condition 

Correct 

Incorrect 

All 

Task  1 

control 

2 

3 

2 

experimental 

2 

4 

2 

Task  2 

control 

2 

3 

3 

experimental 

2 

3 

2 

Task  3 

control 

2 

4 

3 

experimental 

2 

3 

2 

Task  4 

control 

4 

2 

3 

experimental 

2 

2 

2 

Table  6;  Median  ease-of-use  Likert  scores  for  all  tasks,  condi¬ 
tions,  and  correctness  in  the  validation  experiment.  Higher 
scores  indicate  more  difficulty. 
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Figure  6:  In  the  pilot  experiment,  TurtleGuard  was  launched 
via  the  icon  labeled  “Privacy  Manager”  (top  left),  but  then 
added  as  a  sub-panel  to  the  Settings  app,  labeled  “Permis¬ 
sions  Manager,”  for  the  validation  experiment  (top  right).  In 
the  control  condition  in  the  pilot  experiment  and  both  con¬ 
ditions  in  the  validation  experiment,  the  Settings  app  was 
accessible  from  the  home  screen  (bottom). 
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Figure  8:  Ease  of  use  histogram  for  Task  1  (validation  ex¬ 
periment) 
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Figure  9:  Ease  of  use  histogram  for  Task  2  (validation  ex-  Figure  10;  Ease  of  use  histogram  for  Task  3  (validation  ex¬ 
periment)  periment) 
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Task  4:  East  of  i 
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Figure  1 1 :  Ease  of  use  histogram  for  Task  4  (validation  ex¬ 
periment) 
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AOFU 

API 

BYOD 

CDMA 

ESM 

GPS 

GSM 

HTTPS 

ML 

NFC 

OS 

SSID 

SMS 

SVM 

UI 


List  of  Symbols,  Abbreviations  and  Acronyms 

Ask  On  First  Use 

Application  Programming  Interface 

Bring  Your  Own  Device 

Code  Division  Multiple  Access 

Experience  Sampling  Method 

Global  Positioning  System 

Global  System  for  Mobiles 

Hyper  Text  Transfer  Protocol  Secure 

Machine-learning 

Near  field  communication 

Operating  System 

Service  Set  Identifier 

Short  Message  Service 

Support  Vector  Machines 

User  Interface 
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Glossary  of  Terminology 


BYOD  Bring  Your  Own  Device  Practice  where  employees  and  workers  can 

bring  their  own  mobile  devices  into  the  workplace  to  be  used  for  both  work  and  personal 
purposes. 

Background  Application  Application  that  is  running  on  a  mobile  device  but  is 
not  readily  visible  to  the  end  user. 

Classifier  The  Classifier  is  a  system  that  looks  at  user  preferences  (privacy 
preferences  in  the  case  of  this  report)  and  then  recommends  actions  to  perform  on  the 
behalf  of  the  end  user  (e.g.  deny  access  to  a  specific  application  for  contact  information). 
Over  time,  the  classifier  learns  how  to  correctly  classify  user  preferences  and  associate 
these  with  actions  that  it  should  perform  for  the  end  user. 

Android  The  operating  system  at  the  time  of  this  writing  that  is  developed  by 
Google  used  on  mobile  phones. 

Dashboard  The  user  interface  we  provided  for  allowing  users  to  have  control  over 
the  privacy  settings  for  their  applications  on  the  Android  phones 

Lo-Fidelity  prototypes  Lo-Fidelity  prototypes  are  prototypes  created  for  user 
testing  that  display  minimal  information  and  are  not  entirely  implemented. 

Hi-Fidelity  prototypes  Hi-Fidelity  prototypes  are  prototypes  created  for  user 
testing  that  look  and  in  some  cases  act  as  if  they  are  implemented,  however  they  still 
lack  certain  functionality  that  would  make  them  actual  implementations. 
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